pwd
'/Users/mohammedkhattab/Downloads/PhD_scProject/scgast_Project'
from IPython.display import Image, display
display(Image(filename='Science.png'))
Introduction¶
- In this notebook we will re-visit the data from the paper titled Single-cell transcriptomic characterization of a gastrulating human embryo Tyser et al. (2021), where we will explore the data of the 1195 single cell from embryonic stage Carnegie Stage 7 (CS7) around the third week after fertilization, explore the present cell types and plot them in UMAP plots (Uniform Manifold Approximation and Projection), eventually, we will use the R package infercnv to identify any type of copy number variations in comparison to the most basal cell type in the sample which is the Epiblast, and see how single cell data can tell us about it.¶
Chapter 1: Packages import and data upload¶
import numpy as np
import pandas as pd
import anndata as ad
import pyreadr
import scanpy as sc
import urllib.request
from pathlib import Path
import os
import matplotlib.pyplot as plt
from scipy.sparse import csr_matrix
import scarches as sca
import scanorama
from scipy.io import mmwrite
import anndata2ri
from rpy2.robjects import conversion
from rpy2.robjects.conversion import localconverter
with localconverter(anndata2ri.converter):
%reload_ext rpy2.ipython
sc.settings.verbosity = 3
sc.logging.print_header()
sc.settings.set_figure_params(dpi=80, facecolor="white")
R libraries for later¶
%%R
library(reticulate)
library(Seurat)
library(tidyverse)
library(R.utils)
library(devtools)
library(tidyverse)
library(Matrix)
library(infercnv)
# Load .rds file provided by the authors that contains the expression values
raw_reads = pyreadr.read_r("raw_reads.rds")
print(raw_reads.keys())
# Extract the expression values from the .rds object
raw_reads = raw_reads[None]
print(raw_reads.shape) #(1195, 57490)
raw_reads.head(2)
UMAP file upload¶
- raw_reads lacks the cell names, luckily they are provided in another object by the authors called umap, this dataframe contains some useful data, but most importantly it contains the same order of cell names as the raw_reads object, enabling the export of the cell names to the counts object.¶
# Load .rds file 'umap'
umap = pyreadr.read_r("umap.rds")
# Inspect keys
print(umap.keys()) # odict_keys([None])
umap = umap[None]
print(umap.shape) # (1195, 6)
odict_keys([None]) (1195, 6)
#make cell names the index for the `umap`object
umap = umap.set_index("cell_name")
umap.head(2)
| X | X0 | X1 | cluster_id | sub_cluster | |
|---|---|---|---|---|---|
| cell_name | |||||
| SS.sc7785290 | 0 | 12.213498 | -0.550328 | Hemogenic Endothelial Progenitors | Blood Progenitors |
| SS.sc7786612 | 1 | 2.404149 | -7.389468 | Endoderm | DE(P) |
#Now we can add the cell names to the raw_reads object using set_index() function
raw_reads_indexed = raw_reads.set_index(umap.index)
#raw_reads_indexed
umap.index == raw_reads_indexed.index
Metadata upload¶
This 'metadata' file E-MTAB-9388.sdrf.txt is another file contains some important biological data, can be downloaded from here to continue the analysis if the file not already present.¶
Note: The file E-MTAB-9388.idf.txt found in the same link above is not necessary for this analysis.¶
# metadata upload
metadata = pd.read_csv("E-MTAB-9388.zip", sep= "\t")
print(metadata.shape) #After investigating the dataframe, it seems that each cell is repeated once.
(2390, 41)
#We will remove the duplicates using drop_duplicates() function
metadata_clear = metadata.drop_duplicates(subset="Source Name")
#In order to unify the cell names we will change "_" to "." to match the `umap` dataframe we uploaded eariler
metadata_clear.loc[:, "Source Name"] = metadata_clear["Source Name"].str.replace("_", ".", regex=False)
#Make it an Index
metadata_clear = metadata_clear.set_index("Source Name")
# Reindex the umap DataFrame to match the metadata_clear index
umap = umap.reindex(metadata_clear.index)
# Add the annotation column `cluster_id`for later
metadata_clear["cluster_id"] = umap["cluster_id"]
Chapter 2: Create AnnData Object¶
# Build AnnData object
adata = ad.AnnData(
X= raw_reads_indexed.values, # expression matrix
obs= metadata_clear, # cell metadata
var= pd.DataFrame(index=raw_reads_indexed.columns) # gene metadata
)
print(adata) # should print: AnnData object with n_obs × n_vars = 1195 × 57490
AnnData object with n_obs × n_vars = 1195 × 57490
obs: 'Comment[ENA_SAMPLE]', 'Comment[BioSD_SAMPLE]', 'Characteristics[organism]', 'Characteristics[developmental stage]', 'Characteristics[age]', 'Unit[time unit]', 'Characteristics[individual]', 'Characteristics[sex]', 'Characteristics[organism part]', 'Characteristics[sampling site]', 'Characteristics[inferred cell type - authors labels]', 'Characteristics[inferred cell type - ontology labels]', 'Material Type', 'Protocol REF', 'Protocol REF.1', 'Protocol REF.2', 'Extract Name', 'Comment[LIBRARY_LAYOUT]', 'Comment[LIBRARY_SELECTION]', 'Comment[LIBRARY_SOURCE]', 'Comment[LIBRARY_STRATEGY]', 'Comment[NOMINAL_LENGTH]', 'Comment[NOMINAL_SDEV]', 'Comment[end bias]', 'Comment[input molecule]', 'Comment[library construction]', 'Comment[primer]', 'Comment[single cell isolation]', 'Comment[spike in]', 'Protocol REF.3', 'Performer', 'Assay Name', 'Technology Type', 'Comment[ENA_EXPERIMENT]', 'Scan Name', 'Comment[SUBMITTED_FILE_NAME]', 'Comment[ENA_RUN]', 'Comment[FASTQ_URI]', 'Factor Value[single cell identifier]', 'Factor Value[inferred cell type - ontology labels]', 'cluster_id'
AnnData object with n_obs × n_vars = 1195 × 57490 obs: 'Comment[ENA_SAMPLE]', 'Comment[BioSD_SAMPLE]', 'Characteristics[organism]', 'Characteristics[developmental stage]', 'Characteristics[age]', 'Unit[time unit]', 'Characteristics[individual]', 'Characteristics[sex]', 'Characteristics[organism part]', 'Characteristics[sampling site]', 'Characteristics[inferred cell type - authors labels]', 'Characteristics[inferred cell type - ontology labels]', 'Material Type', 'Protocol REF', 'Protocol REF.1', 'Protocol REF.2', 'Extract Name', 'Comment[LIBRARY_LAYOUT]', 'Comment[LIBRARY_SELECTION]', 'Comment[LIBRARY_SOURCE]', 'Comment[LIBRARY_STRATEGY]', 'Comment[NOMINAL_LENGTH]', 'Comment[NOMINAL_SDEV]', 'Comment[end bias]', 'Comment[input molecule]', 'Comment[library construction]', 'Comment[primer]', 'Comment[single cell isolation]', 'Comment[spike in]', 'Protocol REF.3', 'Performer', 'Assay Name', 'Technology Type', 'Comment[ENA_EXPERIMENT]', 'Scan Name', 'Comment[SUBMITTED_FILE_NAME]', 'Comment[ENA_RUN]', 'Comment[FASTQ_URI]', 'Factor Value[single cell identifier]', 'Factor Value[inferred cell type - ontology labels]', 'cluster_id'
# Start the standard workflow for Anndata objects
adata.var_names_make_unique()
sc.pp.calculate_qc_metrics(adata, inplace=True)
#Check the distribution of counts across cells
sc.pl.violin(adata,['total_counts', 'n_genes_by_counts'], multi_panel= True)
The violin plot tell us that our cell population doesn't require further cutting, since the cells are well distributed, no outliers and to maintain as much cells as possible for the further analysis.¶
# Normalizing to median total counts
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
normalizing counts per cell
finished (0:00:00)
sc.pl.highest_expr_genes(adata, n_top = 10, )
normalizing counts per cell
finished (0:00:00)
# Perform the standard scaling and dimentionality reduction workflow
sc.pp.scale(adata, max_value=10)
sc.tl.pca(adata, n_comps=30, random_state=42)
sc.pp.neighbors(adata, n_neighbors=10, n_pcs=30, random_state=42)
sc.tl.umap(adata, random_state=42)
# Annotate highly variable genes in the `adata`object
sc.pp.highly_variable_genes(adata)
sc.pl.highly_variable_genes(adata)
#Perform leiden clustering with resolution = 0.75
sc.tl.leiden(adata, resolution=0.75, flavor="igraph", n_iterations=2, random_state=42)
fig, axs = plt.subplots(1, 2, figsize=(10, 5), dpi=120)
sc.pl.umap(adata, color="leiden", ax=axs[0], show=False)
sc.pl.umap(adata, color="pct_counts_in_top_50_genes", ax=axs[1], show=False)
plt.show()
adata.obs
| Comment[ENA_SAMPLE] | Comment[BioSD_SAMPLE] | Characteristics[organism] | Characteristics[developmental stage] | Characteristics[age] | Unit[time unit] | Characteristics[individual] | Characteristics[sex] | Characteristics[organism part] | Characteristics[sampling site] | ... | n_genes_by_counts | log1p_n_genes_by_counts | total_counts | log1p_total_counts | pct_counts_in_top_50_genes | pct_counts_in_top_100_genes | pct_counts_in_top_200_genes | pct_counts_in_top_500_genes | leiden | AnnotatedCluster | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Source Name | |||||||||||||||||||||
| SS.sc7785278 | ERS5181934 | SAMEA7423586 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 4922 | 8.501673 | 377151.968991 | 12.840406 | 23.622851 | 30.429528 | 39.508748 | 55.256619 | 0 | HEP |
| SS.sc7785279 | ERS5181935 | SAMEA7423587 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 6942 | 8.845489 | 259888.990001 | 12.468014 | 18.311227 | 25.105456 | 33.482184 | 48.358286 | 1 | Primative Streak |
| SS.sc7785280 | ERS5181936 | SAMEA7423588 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 6140 | 8.722743 | 437911.014986 | 12.989773 | 16.438116 | 23.168028 | 31.973973 | 48.713081 | 2 | Emergent Mesoderm |
| SS.sc7785281 | ERS5181937 | SAMEA7423589 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 3800 | 8.243019 | 322351.983054 | 12.683402 | 18.344805 | 27.028758 | 38.095824 | 57.775552 | 3 | Nascent Mesoderm |
| SS.sc7785282 | ERS5181938 | SAMEA7423590 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 2964 | 7.994632 | 394318.996000 | 12.884918 | 25.350752 | 33.529113 | 43.988398 | 62.262868 | 4 | Extra-embryonic Mesoderm |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| SS.sc7788385 | ERS5183125 | SAMEA7424905 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | caudal | ... | 6361 | 8.758098 | 226569.027995 | 12.330809 | 17.380574 | 24.134041 | 32.566881 | 48.247448 | 7 | Advanced Mesoderm |
| SS.sc7788387 | ERS5183126 | SAMEA7424906 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | caudal | ... | 4647 | 8.444192 | 353921.980998 | 12.776835 | 17.275106 | 24.978208 | 35.842762 | 54.734178 | 3 | Nascent Mesoderm |
| SS.sc7788388 | ERS5183127 | SAMEA7424907 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | caudal | ... | 2822 | 7.945555 | 366998.009999 | 12.813114 | 19.084758 | 28.550866 | 41.422998 | 63.511382 | 3 | Nascent Mesoderm |
| SS.sc7788390 | ERS5183128 | SAMEA7424908 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | caudal | ... | 6110 | 8.717846 | 459601.642974 | 13.038118 | 19.059282 | 26.192770 | 35.600091 | 51.565122 | 7 | Advanced Mesoderm |
| SS.sc7788391 | ERS5183129 | SAMEA7424909 | Homo sapiens | embryo | 16 to 19 | day | CS7 | male | whole organism | yolk sac | ... | 5612 | 8.632841 | 349231.986990 | 12.763495 | 17.768009 | 24.528315 | 33.272577 | 49.325280 | 9 | Epiblast |
1195 rows × 51 columns
Chapter 3: Cell type annotation¶
# Find marker genes for each of leiden cluster
sc.tl.rank_genes_groups(adata, 'leiden', method='logreg')
sc.pl.rank_genes_groups(adata, n_genes=20, sharey=False)
ranking genes
finished: added to `.uns['rank_genes_groups']`
'names', sorted np.recarray to be indexed by group ids
'scores', sorted np.recarray to be indexed by group ids
(0:00:02)
The following kernel contains a handful of common markers from Supplementary Note 1 - Annotation of gastrula cell types.¶
markers = ["leiden", "GATA1", "TBXT", "MSGN1", "MESP1", "MEF2C",
"LEFTY2", "FOXF1", "HAND1", "FOXA2", "GATA6", "HOXA1",
"CDH1", "FST", "DLX5", "SOX2"]
# Create subplots
fig, axs = plt.subplots(8, 2, figsize=(20, 40), dpi=120)
for i, marker in enumerate(markers):
row = i // 2
col = i % 2
sc.pl.umap(adata, color=marker, ax=axs[row, col], show=False)
plt.show()
Dotplot for the top markers¶
# Extract ranked marker genes
marker_df = sc.get.rank_genes_groups_df(adata, group=None)
top_markers = marker_df.groupby("group", observed=False).head()
sc.pl.dotplot(
adata,
var_names=top_markers['names'].unique().tolist(),
groupby="leiden",
standard_scale="var",
dendrogram=True
)
Identifying main markers in each cluster, thus representing its cell type.¶
marker_genes = {
'HEP': ['SPI1', 'MEF2C'],
'Primative Streak': ['TBXT', 'CDH1' , 'FST'],
'Emergent Mesoderm': ['MESP1', 'LHX1', 'LEFTY2'],
'Nascent Mesoderm': ['TBXT', 'MESP1', 'MSGN1'],
'Extra-embryonic Mesoderm': ['FOXF1', 'HAND1'],
'Axial Mesoderm': ['TBXT', 'FOXA2', 'CDH1'],
'Erythrocytes': ['GATA1', 'HBZ', 'HBE1'],
'Advanced Mesoderm': ['MESP1', 'PDGFRA','BMP4', 'SNAI2', "HAND1", "GATA6"],
'Endoderm': ['SOX17', 'FOXA2', 'CXCR4', 'TMA7'],
'Epiblast' : ['SOX2', 'OTX2', 'CDH1'],
'Ectoderm (Amniotic/Embryonic)' : ['DLX5', 'TFAP2A', 'GATA3'],
'Caudal Ad. M and PS / N. M': "HOXA1"
}
sc.pl.dotplot(
adata,
marker_genes,
groupby="leiden",
standard_scale="var",
dendrogram=True
)
The heatmap above is a little messy but this is should be expected, as lots of cells share multiple important markers, but you ultimately each cell type will end up with its unique set of markers.¶
# Create a mapping dictionary for cell types
cluster2celltype = {
"0" :'HEP',
"1" :'Primative Streak',
"2" :'Emergent Mesoderm',
"3" :'Nascent Mesoderm',
"4" :'Extra-embryonic Mesoderm',
"5" :'Axial Mesoderm',
"6" :'Erythrocytes',
"7" :'Advanced Mesoderm',
"8" :'Endoderm',
"9" :'Epiblast',
"10" :'Caudal Ad. Mesoderm and PS / Nascent Mesoderm',
"11" :'Ectoderm (Amniotic/Embryonic)'
}
# Add a new column with annotations
adata.obs["AnnotatedCluster"] = adata.obs["leiden"].map(cluster2celltype)
sc.tl.leiden(adata, resolution= 0.75, flavor="igraph", n_iterations=2)
fig, axs = plt.subplots(1, 2, figsize=(10,3), dpi=220)
sc.pl.umap(adata, color="leiden", ax=axs[0], show=False)
sc.pl.umap(adata, color="AnnotatedCluster", ax=axs[1], show=False)
plt.show()
running Leiden clustering
finished: found 14 clusters and added
'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
Fig. 1c from reference paper for comparison.¶
display(Image(filename='Fig1c.png', width=700, height=300))
Optional step: h5ad to seurat¶
If wanted save a version of the AnnData object¶
adata.write("adata_obj.h5ad")
Chapter 4: InferCNV for copy number variations analysis¶
Create metadata file¶
#add metadata
adata.obs.to_csv("metadata.csv")
%%R
# read the metadata in R
metadata <- read_csv("metadata.csv")
metadata
Rows: 1195 Columns: 52 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "," chr (41): Source Name, Comment[ENA_SAMPLE], Comment[BioSD_SAMPLE], Character... dbl (11): Comment[NOMINAL_LENGTH], Comment[NOMINAL_SDEV], n_genes_by_counts,... ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. # A tibble: 1,195 × 52 `Source Name` `Comment[ENA_SAMPLE]` `Comment[BioSD_SAMPLE]` <chr> <chr> <chr> 1 SS.sc7785278 ERS5181934 SAMEA7423586 2 SS.sc7785279 ERS5181935 SAMEA7423587 3 SS.sc7785280 ERS5181936 SAMEA7423588 4 SS.sc7785281 ERS5181937 SAMEA7423589 5 SS.sc7785282 ERS5181938 SAMEA7423590 6 SS.sc7785283 ERS5181939 SAMEA7423591 7 SS.sc7785286 ERS5181940 SAMEA7423592 8 SS.sc7785288 ERS5181941 SAMEA7423593 9 SS.sc7785289 ERS5181942 SAMEA7423594 10 SS.sc7785290 ERS5181943 SAMEA7423595 # ℹ 1,185 more rows # ℹ 49 more variables: `Characteristics[organism]` <chr>, # `Characteristics[developmental stage]` <chr>, `Characteristics[age]` <chr>, # `Unit[time unit]` <chr>, `Characteristics[individual]` <chr>, # `Characteristics[sex]` <chr>, `Characteristics[organism part]` <chr>, # `Characteristics[sampling site]` <chr>, # `Characteristics[inferred cell type - authors labels]` <chr>, … # ℹ Use `print(n = ...)` to see more rows
Create raw_counts.tsv¶
Seurat handel the counts when the genes represented as rownames.¶
#We have to transpose the matrix.
raw_reads_indexed_transposed = raw_reads_indexed.T
Create raw_counts and cell cell-type annotation file for inferCNV¶
annotation_file = adata.obs[['AnnotatedCluster']].copy()
#annotation_file
#Save raw_counts as raw_counts.tsv
raw_reads_indexed_transposed.to_csv("raw_counts.tsv", sep="\t")
#Save cell cell-type annotation file as cell_annotations.tsv
annotation_file.to_csv("cell_annotations.tsv", sep="\t", header=False)
#raw_reads_indexed_transposed
Re-upload the data in R as tsv files¶
%%R
raw_counts <- read.table("raw_counts.tsv", sep="\t", header=TRUE, row.names=1, check.names=FALSE)
head(rownames(raw_counts))
[1] "A1BG" "A1BG.AS1" "A1CF" "A2M" "A2M.AS1" "A2ML1"
%%R
annotation_file <- read_tsv("cell_annotations.tsv")
annotation_file
Rows: 1194 Columns: 2 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "\t" chr (2): SS.sc7785278, HEP ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. # A tibble: 1,194 × 2 SS.sc7785278 HEP <chr> <chr> 1 SS.sc7785279 Primative Streak 2 SS.sc7785280 Nascent Mesoderm 3 SS.sc7785281 Extra-embryonic Mesoderm 4 SS.sc7785282 Axial Mesoderm 5 SS.sc7785283 Emergent Mesoderm 6 SS.sc7785286 HEP 7 SS.sc7785288 Nascent Mesoderm 8 SS.sc7785289 Primative Streak 9 SS.sc7785290 Extra-embryonic Mesoderm 10 SS.sc7785291 Axial Mesoderm # ℹ 1,184 more rows # ℹ Use `print(n = ...)` to see more rows
%%R
#Gene order file found online
gene_order <- read_tsv("gencode_v19_gene_pos.txt")
gene_order
Rows: 55764 Columns: 4 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "\t" chr (2): DDX11L1, chr1 dbl (2): 11869, 14412 ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. # A tibble: 55,764 × 4 DDX11L1 chr1 `11869` `14412` <chr> <chr> <dbl> <dbl> 1 WASH7P chr1 14363 29806 2 MIR1302-11 chr1 29554 31109 3 FAM138A chr1 34554 36081 4 OR4G4P chr1 52473 54936 5 OR4G11P chr1 62948 63887 6 OR4F5 chr1 69091 70008 7 RP11-34P13.7 chr1 89295 133566 8 RP11-34P13.8 chr1 89551 91105 9 CICP27 chr1 131025 134836 10 AL627309.1 chr1 134901 139379 # ℹ 55,754 more rows # ℹ Use `print(n = ...)` to see more rows
%%R
#Add column names for the file
colnames(gene_order) <- c("gene","chr","start","end")
%%R
# How many overlap
length(intersect(rownames(raw_counts), gene_order$gene))
[1] 30222
%%R
# Optional: save clean version for inferCNV (no header, 4 cols)
write.table(gene_order, "gene_order.tsv",
sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE)
%%R
#make sure the gene_order have same genes as in raw_counts to prevent errors
gene_order_updated <- subset(gene_order, gene %in% rownames(raw_counts))
gene_order_updated <- gene_order_updated[match(rownames(raw_counts), gene_order_updated$gene), ]
# Remove any rows with newly emergent NA values from the subsetting step
gene_order_updated <- na.omit(gene_order_updated)
dim(gene_order_updated)
[1] 30222 4
Refine the genes present, their intersection with raw_counts and their order.¶
%%R
# Keep only genes that exist in gene_order
common_genes <- intersect(rownames(raw_counts), gene_order_updated$gene)
# Subset expression matrix
raw_counts <- raw_counts[common_genes, ]
# Subset gene order and keep the same order as counts
gene_order_final <- gene_order_updated[match(rownames(raw_counts), gene_order_updated$gene), ]
%%R
# check
all(rownames(raw_counts) == gene_order_final$gene)
[1] TRUE
%%R
# Save a clean version for inferCNV (no header, 4 cols)
write.table(gene_order, "gene_order_final.tsv",
sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE)
InferCNV¶
%%R
infercnv_obj <- CreateInfercnvObject(raw_counts_matrix = as.matrix(raw_counts),
annotations_file = "cell_annotations.tsv",
gene_order_file = "gene_order_final.tsv",
delim = "\t",
ref_group_names = "Epiblast" )
INFO [2025-10-08 15:46:07] Parsing gene order file: gene_order_final.tsv INFO [2025-10-08 15:46:07] Parsing cell annotations file: cell_annotations.tsv INFO [2025-10-08 15:46:07] ::order_reduce:Start. INFO [2025-10-08 15:46:07] .order_reduce(): expr and order match. INFO [2025-10-08 15:46:07] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 30222,1195 Total=-174820.018278072 Min=-10 Max=10. INFO [2025-10-08 15:46:07] num genes removed taking into account provided gene ordering list: 1657 = 5.4827609026537% removed. INFO [2025-10-08 15:46:07] -filtering out cells < 100 or > Inf, removing 51.8828 % of cells WARN [2025-10-08 15:46:07] Please use "options(scipen = 100)" before running infercnv if you are using the analysis_mode="subclusters" option or you may encounter an error while the hclust is being generated. INFO [2025-10-08 15:46:09] validating infercnv_obj
%%R
infercnv_obj_run <- infercnv::run(
infercnv_obj,
out_dir = "output_dir",
cutoff = 0, # instead of 1, keeps more genes
min_cells_per_gene = 5, # relax cell filter
HMM = T,
per_chr_hmm_subclusters=TRUE,
HMM_type="i3",
analysis_mode="subclusters", #inferCNV will attempt to find subpopulations with distinct CNV patterns, rather than assuming each provided group is uniform
denoise = T)
INFO [2025-10-08 15:46:12] ::process_data:Start INFO [2025-10-08 15:46:12] Creating output path output_dir INFO [2025-10-08 15:46:12] Checking for saved results. INFO [2025-10-08 15:46:12] STEP 1: incoming data INFO [2025-10-08 15:46:19] STEP 02: Removing lowly expressed genes INFO [2025-10-08 15:46:19] ::above_min_mean_expr_cutoff:Start INFO [2025-10-08 15:46:19] Removing 6708 genes from matrix as below mean expr threshold: 0 INFO [2025-10-08 15:46:19] validating infercnv_obj INFO [2025-10-08 15:46:19] There are 21857 genes and 575 cells remaining in the expr matrix. INFO [2025-10-08 15:46:20] Removed 7850 genes having fewer than 5 min cells per gene = 35.9153 % genes removed here INFO [2025-10-08 15:46:20] validating infercnv_obj INFO [2025-10-08 15:46:24] STEP 03: normalization by sequencing depth INFO [2025-10-08 15:46:24] normalizing counts matrix by depth INFO [2025-10-08 15:46:24] Computed total sum normalization factor as median libsize: 1565.062421 INFO [2025-10-08 15:46:28] STEP 04: log transformation of data INFO [2025-10-08 15:46:28] transforming log2xplus1() INFO [2025-10-08 15:46:32] STEP 08: removing average of reference data (before smoothing) INFO [2025-10-08 15:46:32] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 15:46:32] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 15:46:34] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 15:46:38] STEP 09: apply max centered expression threshold: 3 INFO [2025-10-08 15:46:38] ::process_data:setting max centered expr, threshold set to: +/-: 3 INFO [2025-10-08 15:46:41] STEP 10: Smoothing data per cell by chromosome INFO [2025-10-08 15:46:41] smooth_by_chromosome: chr: chr1 INFO [2025-10-08 15:46:42] smooth_by_chromosome: chr: chr2 INFO [2025-10-08 15:46:42] smooth_by_chromosome: chr: chr3 INFO [2025-10-08 15:46:42] smooth_by_chromosome: chr: chr4 INFO [2025-10-08 15:46:43] smooth_by_chromosome: chr: chr5 INFO [2025-10-08 15:46:43] smooth_by_chromosome: chr: chr6 INFO [2025-10-08 15:46:44] smooth_by_chromosome: chr: chr7 INFO [2025-10-08 15:46:44] smooth_by_chromosome: chr: chr8 INFO [2025-10-08 15:46:45] smooth_by_chromosome: chr: chr9 INFO [2025-10-08 15:46:45] smooth_by_chromosome: chr: chr10 INFO [2025-10-08 15:46:46] smooth_by_chromosome: chr: chr11 INFO [2025-10-08 15:46:46] smooth_by_chromosome: chr: chr12 INFO [2025-10-08 15:46:46] smooth_by_chromosome: chr: chr13 INFO [2025-10-08 15:46:47] smooth_by_chromosome: chr: chr14 INFO [2025-10-08 15:46:48] smooth_by_chromosome: chr: chr15 INFO [2025-10-08 15:46:48] smooth_by_chromosome: chr: chr16 INFO [2025-10-08 15:46:48] smooth_by_chromosome: chr: chr17 INFO [2025-10-08 15:46:49] smooth_by_chromosome: chr: chr18 INFO [2025-10-08 15:46:50] smooth_by_chromosome: chr: chr19 INFO [2025-10-08 15:46:50] smooth_by_chromosome: chr: chr20 INFO [2025-10-08 15:46:50] smooth_by_chromosome: chr: chr21 INFO [2025-10-08 15:46:50] smooth_by_chromosome: chr: chr22 INFO [2025-10-08 15:46:56] STEP 11: re-centering data across chromosome after smoothing INFO [2025-10-08 15:46:56] ::center_smooth across chromosomes per cell INFO [2025-10-08 15:47:01] STEP 12: removing average of reference data (after smoothing) INFO [2025-10-08 15:47:01] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 15:47:01] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 15:47:03] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 15:47:08] STEP 14: invert log2(FC) to FC INFO [2025-10-08 15:47:08] invert_log2(), computing 2^x INFO [2025-10-08 15:47:16] STEP 15: computing tumor subclusters via leiden INFO [2025-10-08 15:47:16] define_signif_tumor_subclusters(p_val=0.1 INFO [2025-10-08 15:47:16] define_signif_tumor_subclusters(), tumor: Advanced Mesoderm INFO [2025-10-08 15:47:16] Setting auto leiden resolution for Advanced Mesoderm to 0.3038 INFO [2025-10-08 15:47:17] define_signif_tumor_subclusters(), tumor: Axial Mesoderm INFO [2025-10-08 15:47:17] Setting auto leiden resolution for Axial Mesoderm to 0.379855 INFO [2025-10-08 15:47:18] define_signif_tumor_subclusters(), tumor: Emergent Mesoderm INFO [2025-10-08 15:47:18] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single subcluster. INFO [2025-10-08 15:47:18] define_signif_tumor_subclusters(), tumor: Endoderm INFO [2025-10-08 15:47:18] Setting auto leiden resolution for Endoderm to 0.321105 INFO [2025-10-08 15:47:19] define_signif_tumor_subclusters(), tumor: Erythrocytes INFO [2025-10-08 15:47:19] Setting auto leiden resolution for Erythrocytes to 0.217096 INFO [2025-10-08 15:47:20] define_signif_tumor_subclusters(), tumor: Extra-embryonic Mesoderm INFO [2025-10-08 15:47:20] Setting auto leiden resolution for Extra-embryonic Mesoderm to 0.172198 INFO [2025-10-08 15:47:21] define_signif_tumor_subclusters(), tumor: HEP INFO [2025-10-08 15:47:21] Setting auto leiden resolution for HEP to 0.333884 INFO [2025-10-08 15:47:23] define_signif_tumor_subclusters(), tumor: Nascent Mesoderm INFO [2025-10-08 15:47:23] Setting auto leiden resolution for Nascent Mesoderm to 0.184162 INFO [2025-10-08 15:47:24] define_signif_tumor_subclusters(), tumor: Primative Streak INFO [2025-10-08 15:47:24] Setting auto leiden resolution for Primative Streak to 0.231139 INFO [2025-10-08 15:47:25] define_signif_tumor_subclusters(), tumor: Epiblast INFO [2025-10-08 15:47:25] Setting auto leiden resolution for Epiblast to 0.202492 INFO [2025-10-08 15:47:26] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:26] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:26] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:27] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:28] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:28] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:28] Less cells in group Emergent Mesoderm than k_nn setting. Keeping as a single per chr subcluster. INFO [2025-10-08 15:47:35] ::plot_cnv:Start INFO [2025-10-08 15:47:35] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=8056179.15098159 Min=0.827673338207976 Max=1.15777732284435. INFO [2025-10-08 15:47:35] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:47:35] plot_cnv(): auto thresholding at: (0.967126 , 1.033409) INFO [2025-10-08 15:47:35] plot_cnv_observation:Start INFO [2025-10-08 15:47:35] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:47:35] clustering observations via method: ward.D INFO [2025-10-08 15:47:35] Number of cells in group(1) is 47 INFO [2025-10-08 15:47:35] group size being clustered: 47,14007 INFO [2025-10-08 15:47:35] Number of cells in group(2) is 1 INFO [2025-10-08 15:47:35] Skipping group: 2, since less than 2 entries INFO [2025-10-08 15:47:35] Number of cells in group(3) is 37 INFO [2025-10-08 15:47:35] group size being clustered: 37,14007 INFO [2025-10-08 15:47:35] Number of cells in group(4) is 9 INFO [2025-10-08 15:47:35] group size being clustered: 9,14007 INFO [2025-10-08 15:47:35] Number of cells in group(5) is 45 INFO [2025-10-08 15:47:35] group size being clustered: 45,14007 INFO [2025-10-08 15:47:35] Number of cells in group(6) is 67 INFO [2025-10-08 15:47:36] group size being clustered: 67,14007 INFO [2025-10-08 15:47:36] Number of cells in group(7) is 4 INFO [2025-10-08 15:47:36] group size being clustered: 4,14007 INFO [2025-10-08 15:47:36] Number of cells in group(8) is 70 INFO [2025-10-08 15:47:36] group size being clustered: 70,14007 INFO [2025-10-08 15:47:36] Number of cells in group(9) is 21 INFO [2025-10-08 15:47:36] group size being clustered: 21,14007 INFO [2025-10-08 15:47:36] Number of cells in group(10) is 2 INFO [2025-10-08 15:47:36] group size being clustered: 2,14007 INFO [2025-10-08 15:47:36] Number of cells in group(11) is 43 INFO [2025-10-08 15:47:36] group size being clustered: 43,14007 INFO [2025-10-08 15:47:36] Number of cells in group(12) is 72 INFO [2025-10-08 15:47:36] group size being clustered: 72,14007 INFO [2025-10-08 15:47:36] Number of cells in group(13) is 14 INFO [2025-10-08 15:47:36] group size being clustered: 14,14007 INFO [2025-10-08 15:47:36] Number of cells in group(14) is 58 INFO [2025-10-08 15:47:36] group size being clustered: 58,14007 INFO [2025-10-08 15:47:36] Number of cells in group(15) is 8 INFO [2025-10-08 15:47:36] group size being clustered: 8,14007 INFO [2025-10-08 15:47:36] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:47:36] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:47:36] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:47:36] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:47:37] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:47:37] Quantiles of plotted data range: 0.967125515789634,0.998062798164108,0.99929349444528,1.00072939080162,1.03340940953818 INFO [2025-10-08 15:47:37] plot_cnv_references:Start INFO [2025-10-08 15:47:37] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:47:37] plot_cnv_references:Number reference groups= 3 INFO [2025-10-08 15:47:37] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:47:37] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:47:37] Quantiles of plotted data range: 0.967125515789634,0.998069252504531,0.999297520712723,1.00074408335968,1.03340940953818 INFO [2025-10-08 15:47:45] ::plot_cnv:Start INFO [2025-10-08 15:47:45] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=8056179.15098159 Min=0.827673338207976 Max=1.15777732284435. INFO [2025-10-08 15:47:45] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:47:45] plot_cnv(): auto thresholding at: (0.967126 , 1.033409) INFO [2025-10-08 15:47:45] plot_cnv_observation:Start INFO [2025-10-08 15:47:45] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:47:45] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:47:45] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:47:45] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:47:45] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:47:46] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:47:46] Quantiles of plotted data range: 0.967125515789634,0.998062798164108,0.99929349444528,1.00072939080162,1.03340940953818 INFO [2025-10-08 15:47:46] plot_cnv_references:Start INFO [2025-10-08 15:47:46] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:47:46] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:47:46] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:47:46] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:47:46] Quantiles of plotted data range: 0.967125515789634,0.998069252504531,0.999297520712723,1.00074408335968,1.03340940953818 INFO [2025-10-08 15:47:47] STEP 17: HMM-based CNV prediction INFO [2025-10-08 15:47:47] i3HMM_predict_CNV_via_HMM_on_tumor_subclusters(i3_p_val=0.05, use_KS=FALSE) INFO [2025-10-08 15:47:47] determine mean delta (sigma: 0.00875461, p=0.05) -> 0.0144 INFO [2025-10-08 15:47:51] -done predicting CNV based on initial tumor subclusters INFO [2025-10-08 15:47:51] get_predicted_CNV_regions(subcluster) INFO [2025-10-08 15:47:51] -processing cell_group_name: Advanced Mesoderm.Advanced Mesoderm_s1, size: 47 INFO [2025-10-08 15:47:56] -processing cell_group_name: Advanced Mesoderm.Advanced Mesoderm_s2, size: 1 INFO [2025-10-08 15:48:01] -processing cell_group_name: Axial Mesoderm.Axial Mesoderm_s1, size: 37 INFO [2025-10-08 15:48:06] -processing cell_group_name: Emergent Mesoderm.Emergent Mesoderm, size: 9 INFO [2025-10-08 15:48:10] -processing cell_group_name: Endoderm.Endoderm_s1, size: 45 INFO [2025-10-08 15:48:15] -processing cell_group_name: Erythrocytes.Erythrocytes_s1, size: 67 INFO [2025-10-08 15:48:20] -processing cell_group_name: Erythrocytes.Erythrocytes_s2, size: 4 INFO [2025-10-08 15:48:25] -processing cell_group_name: Extra-embryonic Mesoderm.Extra-embryonic Mesoderm_s1, size: 70 INFO [2025-10-08 15:48:30] -processing cell_group_name: Extra-embryonic Mesoderm.Extra-embryonic Mesoderm_s2, size: 21 INFO [2025-10-08 15:48:34] -processing cell_group_name: Extra-embryonic Mesoderm.Extra-embryonic Mesoderm_s3, size: 2 INFO [2025-10-08 15:48:39] -processing cell_group_name: HEP.HEP_s1, size: 43 INFO [2025-10-08 15:48:44] -processing cell_group_name: Nascent Mesoderm.Nascent Mesoderm_s1, size: 72 INFO [2025-10-08 15:48:49] -processing cell_group_name: Nascent Mesoderm.Nascent Mesoderm_s2, size: 14 INFO [2025-10-08 15:48:54] -processing cell_group_name: Primative Streak.Primative Streak_s1, size: 58 INFO [2025-10-08 15:49:00] -processing cell_group_name: Primative Streak.Primative Streak_s2, size: 8 INFO [2025-10-08 15:49:04] -processing cell_group_name: Epiblast.Epiblast_s1, size: 44 INFO [2025-10-08 15:49:09] -processing cell_group_name: Epiblast.Epiblast_s3, size: 19 INFO [2025-10-08 15:49:14] -processing cell_group_name: Epiblast.Epiblast_s2, size: 14 INFO [2025-10-08 15:49:19] -writing cell clusters file: output_dir/17_HMM_predHMMi3.leiden.hmm_mode-subclusters.cell_groupings INFO [2025-10-08 15:49:19] -writing cnv regions file: output_dir/17_HMM_predHMMi3.leiden.hmm_mode-subclusters.pred_cnv_regions.dat INFO [2025-10-08 15:49:19] -writing per-gene cnv report: output_dir/17_HMM_predHMMi3.leiden.hmm_mode-subclusters.pred_cnv_genes.dat INFO [2025-10-08 15:49:19] -writing gene ordering info: output_dir/17_HMM_predHMMi3.leiden.hmm_mode-subclusters.genes_used.dat INFO [2025-10-08 15:49:23] ::plot_cnv:Start INFO [2025-10-08 15:49:23] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=16053252 Min=1 Max=3. INFO [2025-10-08 15:49:23] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:49:23] plot_cnv_observation:Start INFO [2025-10-08 15:49:23] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:49:23] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:49:23] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:49:23] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:49:23] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:49:24] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:49:24] Quantiles of plotted data range: 1,2,2,2,3 INFO [2025-10-08 15:49:24] plot_cnv_references:Start INFO [2025-10-08 15:49:24] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:49:24] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:49:24] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:49:24] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:49:24] Quantiles of plotted data range: 1,2,2,2,3 INFO [2025-10-08 15:49:25] STEP 18: Run Bayesian Network Model on HMM predicted CNVs INFO [2025-10-08 15:49:25] Creating the following Directory: output_dir/BayesNetOutput.HMMi3.leiden.hmm_mode-subclusters INFO [2025-10-08 15:49:25] Initializing new MCM InferCNV Object. INFO [2025-10-08 15:49:25] validating infercnv_obj INFO [2025-10-08 15:49:25] Total CNV's: 215 INFO [2025-10-08 15:49:25] Loading BUGS Model. INFO [2025-10-08 15:49:25] Running Sampling Using Parallel with 4 Cores INFO [2025-10-08 15:49:33] Obtaining probabilities post-sampling INFO [2025-10-08 15:49:35] Gibbs sampling time: 0.164916165669759 Minutes INFO [2025-10-08 15:49:44] ::plot_cnv:Start INFO [2025-10-08 15:49:44] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=81272.6424277677 Min=0 Max=0.941422250860327. INFO [2025-10-08 15:49:44] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:49:44] plot_cnv_observation:Start INFO [2025-10-08 15:49:44] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:49:44] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:49:44] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:49:44] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:49:44] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:49:45] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:49:45] Quantiles of plotted data range: 0,0,0,0,0.910900693025275 INFO [2025-10-08 15:49:45] plot_cnv_references:Start INFO [2025-10-08 15:49:45] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:49:45] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:49:45] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:49:45] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:49:45] Quantiles of plotted data range: 0,0,0,0,0.941422250860327 INFO [2025-10-08 15:49:53] STEP 19: Filter HMM predicted CNVs based on the Bayesian Network Model results and BayesMaxPNormal INFO [2025-10-08 15:49:53] Attempting to removing CNV(s) with a probability of being normal above 0.5 INFO [2025-10-08 15:49:53] Removing 35 CNV(s) identified by the HMM. INFO [2025-10-08 15:49:53] Total CNV's after removing: 180 INFO [2025-10-08 15:49:53] Reassigning CNVs based on state probabilities. INFO [2025-10-08 15:49:53] Changing the following CNV's states assigned by the HMM to the following based on the CNV's state probabilities. chr3-region_244 : 1 (P= 0.427350981725549 ) -> 2 (P= 0.427648153209705 ) chr19-region_290 : 3 (P= 0.425837054407641 ) -> 2 (P= 0.429151149292175 ) chr6-region_335 : 1 (P= 0.3814386660175 ) -> 2 (P= 0.493360696737336 ) chr1-region_368 : 3 (P= 0.399579349466875 ) -> 2 (P= 0.401416806826951 ) chr3-region_372 : 3 (P= 0.399579349466875 ) -> 2 (P= 0.401416806826951 ) chr3-region_374 : 3 (P= 0.399579349466875 ) -> 2 (P= 0.401416806826951 ) chr11-region_400 : 3 (P= 0.394019122174418 ) -> 2 (P= 0.402868665981718 ) chr16-region_408 : 3 (P= 0.394019122174418 ) -> 2 (P= 0.402868665981718 ) chr16-region_410 : 3 (P= 0.394019122174418 ) -> 2 (P= 0.402868665981718 ) chr17-region_495 : 1 (P= 0.47082392211849 ) -> 2 (P= 0.472716392363994 ) chr2-region_662 : 3 (P= 0.452343005283811 ) -> 2 (P= 0.488833560093813 ) INFO [2025-10-08 15:49:53] Creating Plots for CNV and cell Probabilities. INFO [2025-10-08 15:50:24] ::plot_cnv:Start INFO [2025-10-08 15:50:24] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=66359.6676657521 Min=0 Max=0.941422250860327. INFO [2025-10-08 15:50:24] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:50:24] plot_cnv_observation:Start INFO [2025-10-08 15:50:24] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:50:24] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:50:24] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:50:24] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:50:24] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:50:25] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:25] Quantiles of plotted data range: 0,0,0,0,0.910900693025275 INFO [2025-10-08 15:50:25] plot_cnv_references:Start INFO [2025-10-08 15:50:25] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:50:25] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:50:25] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:50:25] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:25] Quantiles of plotted data range: 0,0,0,0,0.941422250860327 INFO [2025-10-08 15:50:28] ::plot_cnv:Start INFO [2025-10-08 15:50:28] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=16046257 Min=1 Max=3. INFO [2025-10-08 15:50:29] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:50:29] plot_cnv_observation:Start INFO [2025-10-08 15:50:29] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:50:29] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:50:29] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:50:29] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:50:29] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:50:29] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:29] Quantiles of plotted data range: 1,2,2,2,3 INFO [2025-10-08 15:50:30] plot_cnv_references:Start INFO [2025-10-08 15:50:30] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:50:30] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:50:30] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:50:30] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:30] Quantiles of plotted data range: 1,2,2,2,3 INFO [2025-10-08 15:50:30] STEP 20: Converting HMM-based CNV states to repr expr vals INFO [2025-10-08 15:50:33] ::plot_cnv:Start INFO [2025-10-08 15:50:33] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=8023128.5 Min=0.5 Max=1.5. INFO [2025-10-08 15:50:33] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:50:33] plot_cnv_observation:Start INFO [2025-10-08 15:50:33] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:50:33] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:50:33] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:50:33] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:50:33] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:50:34] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:34] Quantiles of plotted data range: 0.5,1,1,1,1.5 INFO [2025-10-08 15:50:34] plot_cnv_references:Start INFO [2025-10-08 15:50:34] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:50:35] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:50:35] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:50:35] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:35] Quantiles of plotted data range: 0.5,1,1,1,1.5 INFO [2025-10-08 15:50:35] STEP 22: Denoising INFO [2025-10-08 15:50:35] ::process_data:Remove noise, noise threshold defined via ref mean sd_amplifier: 1.5 INFO [2025-10-08 15:50:35] denoising using mean(normal) +- sd_amplifier * sd(normal) per gene per cell across all data INFO [2025-10-08 15:50:35] :: **** clear_noise_via_ref_quantiles **** : removing noise between bounds: 0.989885865291573 - 1.01019101265148 INFO [2025-10-08 15:50:40] ## Making the final infercnv heatmap ## INFO [2025-10-08 15:50:41] ::plot_cnv:Start INFO [2025-10-08 15:50:41] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=8060977.94547191 Min=0.827673338207976 Max=1.15777732284435. INFO [2025-10-08 15:50:41] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:50:41] plot_cnv(): auto thresholding at: (0.966591 , 1.033409) INFO [2025-10-08 15:50:41] plot_cnv_observation:Start INFO [2025-10-08 15:50:41] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:50:41] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:50:41] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:50:41] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:50:41] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:50:42] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:42] Quantiles of plotted data range: 0.96659059046182,1.00003843897153,1.00003843897153,1.00003843897153,1.03340940953818 INFO [2025-10-08 15:50:42] plot_cnv_references:Start INFO [2025-10-08 15:50:42] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:50:42] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:50:42] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:50:42] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:42] Quantiles of plotted data range: 0.96659059046182,1.00003843897153,1.00003843897153,1.00003843897153,1.03340940953818
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: CRELD2, ALG12, FTLP3, SMOX, PRNP, SLC1A6, OR7C1, RASSF2, ZNF333, SLC23A2
NDUFB7, TECR, TMEM230, DNAJB1, GIPC1, PCNA, PTGER1, CDS2, ATPAF2, LINC00654
PKN1, TOM1L2, GPCPD1, DDX39A, CHGB, TRMT6, SREBF1, MCM8, ASF1B, CRLS1
Negative: IL6ST, ITGA2, NNT, SPRTN, ARV1, PTK2B, BNIP3L, SDAD1P1, AC022431.1, MIR5687
GPX8, ARL15, PELO, ITGA1, ISL1, RBMS1, PLA2R1, LY75, CD302, BAZ2B
KCNK1, NTPCR, GNPAT, EPHX2, TRIM35, CDCA2, SETD9, MAP3K1, ANKRD55, FST
PC_ 2
Positive: GIT2, ANKRD13A, TCHP, C12orf76, GLTP, RBFOX2, APOL6, MB, FAM216A, RASD2
TCTN1, MCM5, HVCN1, PPP1CC, HMOX1, CCDC63, CUX2, TOM1, SH2B3, HMGXB4
PDE10A, NAA25, TRAFD1, HECTD4, PTPN11, RPH3A, C6orf118, TIMP3, OAS1, APPBP2
Negative: SPPL2B, OAZ1, SF3A2, WASH5P, LINC01002, PLEKHJ1, MIER2, DOT1L, SHC2, MADCAM1
AP3D1, TPGS1, CDC34, MOB3A, BSG, POLRMT, MKNK2, RNF126, FSTL3, BTBD2
PRSS57, PTBP1, CSNK1G2, AZU1, AC012615.1, CFD, MED16, ADAT3, R3HDM4, KISS1R
PC_ 3
Positive: AL080243.1, EP300, L3MBTL2, RANGAP1, ZC3H7B, DNAJB7, TEF, TOB2, PHF5A, ACO2
POLR3H, PMM1, DESI1, XRCC6, C22orf46, MEI1, CCDC134, SREBF2, KPTN, NAPA
MEIS3, TNFRSF13C, PLA2G4C, DHX34, LIG1, C5AR1, CCDC9, CENPM, SAE1, SYNGR2
Negative: RTN4RL2, SLC43A1, SLC43A3, TIMM10, PRG2, SMTNL1, P2RX3, SSRP1, UBE2L6, TNKS1BP1
SERPING1, APLNR, CLP1, OR5M2P, ZDHHC5, FOLH1, MED19, PTPRJ, TMX2, CTNND1
LPXN, ZFP91, GLYATL2, AP001652.1, FAM111B, MGLL, SEC61A1, ABTB1, RUVBL1, PODXL2
PC_ 4
Positive: CRTC3, BLM, FES, MAN2A2, UNC45A, HDDC3, RCCD1, ANKRD13A, C12orf76, PRC1
GIT2, PEX11A, TCHP, KIF7, VPS33B, TICRR, GLTP, POLG, FANCI, SLCO3A1
ABHD2, FAM216A, MFGE8, HAPLN3, TCTN1, ST8SIA2, ACAN, HVCN1, ISG20, PPP1CC
Negative: MGLL, ABTB1, PODXL2, SEC61A1, MCM2, RUVBL1, TPRA1, EEFSEC, PLXNA1, GATA2
RPN1, CHCHD6, RAB7A, ACAD9, TXNRD3, KIAA1257, EFCC1, GP9, RAB43, ISY1
CNBP, COPG1, HMCES, H1FX, RPL32P3, EFCAB12, MBD4, IFT122, PLXND1, TMCC1
PC_ 5
Positive: HIGD1A, CCDC13, ACKR2, KRBOX1, ZNF662, FAM198A, POMGNT2, SNRK, ANO10, ABHD5
TCAIM, ZNF445, ZNF852, ZKSCAN7, ZNF660, MPRIPP1, ZNF197, PDE10A, ZNF35, APPBP2
C6orf118, ZNF502, QKI, ZNF501, PPM1D, CAHM, BCAS3, KIAA1143, BRIP1, PACRG
Negative: DHDDS, LIN28A, CD52, UBXN11, SH3BGRL3, CEP85, CNKSR1, ZNF593, HMGN2, FAM110D
PDIK1L, AL391650.1, EXTL1, SCARNA18, PAFAH2, STMN1, PAQR7, AUNIP, RPS6KA1, AL020996.1
MTFR1L, MAN1C1, ARID1A, TEX10, MSANTD3, LDLRAP1, INVS, ERP44, TMEFF1, STX17
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: MYOM1, ERGIC2, AC009299.2, PLK2, PKP4, AGT, GPBP1, TANC1, CAPN9, MIER3
WDSUB1, P2RX4, C1orf198, WDR66, CAMKK2, KANSL2, PSMD9, ASB8, RBPMS, ERLIN2
ACAD8, FKBP4, ALG10, PCED1B, PFKM, ROCK1P1, TTC13, EGLN1, KCNK1, AC009506.1
Negative: PAQR8, EFHC1, MCM3, TRAM2, FTH1P5, TFAP2B, TMEM14A, RHAG, ZBTB43, ZBTB34
PPIL1, RALGPS1, C6orf141, GSTA2, ANGPTL2, GARNL3, SLC2A8, ZNF79, CENPQ, LRSAM1
STXBP1, PCMTD2, PTRH1, TTC16, MUT, TOR2A, SH2D3C, CDK9, FPGS, ENG
PC_ 2
Positive: PGPEP1L, TTC23, LRRC28, IGF1R, KIF7, PEX11A, AC022819.1, TICRR, ARRDC4, POLG
FES, BLM, MAN2A2, CRTC3, UNC45A, HDDC3, MEF2A, RCCD1, PRC1, LINC00923
VPS33B, SLCO3A1, LYSMD4, ST8SIA2, MCTP2, FAM174B, CHD2, AC090825.1, RGMA, ADAMTS17
Negative: YIPF4, BIRC6, NLRC4, TTC27, SLC30A6, LTBP1, SPAST, RASGRP3, MEMO1, FAM98A
AC073218.2, ZNF852, ZKSCAN7, ZNF445, ZNF660, TCAIM, MPRIPP1, ZNF197, ZNF35, CRIM1
ABHD5, KIAA1143, ZNF502, ZNF501, ANO10, FEZ2, SNRK, VIT, STRN, POMGNT2
PC_ 3
Positive: SCAND2P, ZSCAN2, LINC00933, EGLN1P1, WDR73, NMB, SEC11A, ALPK3, AC044860.1, AKAP13
KLHL25, NTRK3, MRPL46, MRPS11, DET1, AC013489.1, AEN, ISG20, ACAN, HAPLN3
MFGE8, ABHD2, FANCI, PGPEP1L, TTC23, IGF1R, LRRC28, CRTC3, BLM, FES
Negative: TMEM219, TAOK2, KCTD13, HIRIP3, ASPHD1, SEZ6L2, INO80E, CDIPT, DOC2A, MVP
PAGR1, ALDOA, PRRT2, PPP4C, MAZ, KIF22, TBX6, ZG16, QPRT, SPN
YPEL3, SLC7A5P1, SULT1A4, SLX1B, MAPK3, NPIPB11, ABTB1, MGLL, SNX29P2, PODXL2
PC_ 4
Positive: RPS15AP10, TMEM69, GPBP1L1, IPP, MAST2, PIK3R3, TSPAN1, POMGNT1, LURAP1, RAD54L
LRRC41, NSUN4, FAAH, DMBX1, MKNK1, MOB3C, ATPAF1, EFCAB14, CYP4B1, CYP4A11
CYP4X1, CYP4Z1, LINC00853, PDZK1IP1, TAL1, STIL, CMPK1, TRABD2B, SLC5A9, SPATA6
Negative: PREPL, CAMKMT, SRSF7, SIX3, SRBD1, PRKCE, EPAS1, ATP6V1E2, RHOQ, PIGF
GALM, CRIPT, SOCS5, AC016722.1, HNRNPLL, AC016722.2, MCFD2, RPLP0P6, TTC7A, ATL2
EPCAM, CYP1B1, MSH2, RMDN2, KCNK12, CDC42EP3, AC079250.1, QPCT, MSH6, PRKD3
PC_ 5
Positive: TBX15, WARS2, SPAG17, PHGDH, HMGCS2, WDR3, NOTCH2, GDAP2, FAM72B, HIST2H3DP1
MAN1A2, HIST2H2BA, TRIM45, SEC22B, PDE4DIP, NBPF10, NBPF9, PFN1P2, NBPF8, SRGAP2B
FAM72D, EMBP1, SRGAP2C, FCGR1B, TTF2, PTGFRN, AP4B1, PTPN22, DCLRE1B, HIPK1
Negative: GADD45GIP1, RAD23A, CALR, DAND5, FARSA, LYL1, TRMT1, SYCE2, NACC1, GCDH
XYLB, OXSR1, STX10, MYD88, IER2, ACAA1, KLF1, CACNA1A, DLEC1, CCDC130
PLCD1, MRI1, BAIAP2L2, DNASE2, ZSWIM4, PALM3, C19orf67, IL27RA, SAMD1, PRKACA
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: PGBD5, GALNT2, SECTM1, URB2, PART1, KANSL3, C12orf56, CD7, NUFIP1, TAF5L
DEPDC1B, XPOT, LMAN2L, LINC00330, PLA2G16, RND1, ABCB10, CSNK1D, AIMP1, ELOVL7
TSC22D1, TBK1, LGALS12, CNNM4, PAPSS1, NUP133, SERP2, GNL3LP1, CCDC65, HRASLS5
Negative: CTCF, ATP6V0D1, HSD11B2, ZDHHC1, TPPP3, LRRC36, PLEKHG4, SLC9A5, FHOD1, TMEM208
LRRC29, MRAS, NME9, C2CD4D, TDRKH, ARMC8, OAZ3, RAB13, MRPL9, RIIAD1
SNX27, DBR1, TUFT1, CGN, POGZ, DZIP1L, NCK1, FCGR1B, HIST2H2BA, HIST2H3DP1
PC_ 2
Positive: AIDA, MIA3, DISP1, TAF1A, NDUFB1P2, SNX24, PPIC, SNX2, SUSD4, AC106786.1
HHIPL2, SNCAIP, ZNF474, CAPN2, CEP120, DUSP10, LOX, TP53BP2, CSNK1G3, SRFBP1
HLX, AC138393.1, PRR16, ZNF608, FBXO28, MARC1, HSD17B4, DEGS1, ALDH7A1, MARC2
Negative: TXNRD3, CHCHD6, PLXNA1, TPRA1, MCM2, PODXL2, ABTB1, MGLL, SEC61A1, RUVBL1
EEFSEC, GATA2, RPN1, RAB7A, ACAD9, KIAA1257, EFCC1, GP9, RAB43, ISY1
CNBP, COPG1, HMCES, H1FX, RPL32P3, EFCAB12, MBD4, IFT122, PLXND1, TMCC1
PC_ 3
Positive: GALR3, ANKRD54, EIF3L, MICALL1, GCAT, C22orf23, H1F0, POLR2F, PICK1, TRIOBP
SLC16A8, BAIAP2L2, NOL12, PLA2G6, PDXP, MAFF, TMEM184B, Z83844.1, CSNK1E, SH3BP1
KDELR3, DDX17, GGA1, DMC1, FAM227A, LGALS2, SLCO3A1, VPS33B, ST8SIA2, PRC1
Negative: RTBDN, RNASEH2A, MAST1, PRDX2, AC020934.1, JUNB, DNASE2, HOOK2, KLF1, GCDH
ASNA1, SYCE2, TNPO2, FARSA, FBXW9, CALR, RAD23A, DHPS, GADD45GIP1, DAND5
WDR83, LYL1, TRMT1, MAN2B1, NACC1, STX10, ZNF791, IER2, CACNA1A, ZNF490
PC_ 4
Positive: NPTX1, ENDOV, RPTOR, RNF213, CHMP6, SLC26A11, BAIAP2, SGSH, EIF4A3, AATK
SLC38A10, ACTG1, NPLOC4, PDE6G, OXLD1, GAA, CCDC137, AC139530.1, ARL16, CCDC40
HGS, MRPL12, TBC1D16, SLC25A10, GCGR, CBX4, PPP1R27, CBX8, P4HB, ARHGDIA
Negative: NDUFB1P2, DISP1, AIDA, SUSD4, MIA3, CAPN2, TAF1A, TP53BP2, HHIPL2, AC138393.1
DUSP10, FBXO28, HLX, DEGS1, MARC1, NVL, MARC2, DNAH14, C1orf115, LBR
ENAH, MARK1, EPHX1, RAB3GAP2, TMEM63A, IARS2, LEFTY1, BPNT1, PYCR2, EPRS
PC_ 5
Positive: FAM117A, SLC35B1, KAT7, SPOP, TAC4, NGFR, PHB, DLX4, ZNF652, DLX3
ABI3, GNGT2, ITGA3, IGF2BP1, PDK2, SNF8, UBE2Z, PPP1R9B, CALCOCO2, TTLL6
HOXB13, SGCA, HOXB9, HOXB8, HOXB7, HOXB6, HOXB5, STPG1, NIPAL3, GRHL3
Negative: C22orf23, MICALL1, POLR2F, EIF3L, PICK1, ANKRD54, SLC16A8, GALR3, BAIAP2L2, PLA2G6
GCAT, MAFF, TMEM184B, H1F0, CSNK1E, KDELR3, TRIOBP, DDX17, DMC1, NOL12
FAM227A, PDXP, CBY1, TOMM22, Z83844.1, JOSD1, SH3BP1, GTPBP1, AL021707.2, GGA1
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: TNFSF11, AKAP11, AIMP1, TBCK, GSTCD, INTS12, ARHGEF38, EEF1A1P9, PPA2, TET2
CXXC4, TACR3, CENPE, NOVA1, BDH2, PRKD1, SLC9B2, GNAI3, RTN4IP1, SLC9B1
G2E3, GPR61, QRSL1, VAMP4, CISD2, AMIGO1, SCFD1, PRRC2C, C6orf203, CYB561D1
Negative: ODCP, TNPO3, AC025594.1, IRF5, ATP6V1F, KCP, FLNC, CCDC136, CALU, FAM71F2
METTL2B, HILPDA, IMPDH1, RBM28, AC018635.1, LRRC4, SND1, ARF5, GCC1, ZNF800
GRM8, POT1, GPR37, COL6A4P2, PIK3R4, FAM86HP, ATP2C1, ASTE1, ALG1L2, NEK11
PC_ 2
Positive: PSMC4, FCGBP, FBL, ZNF546, DYRK1B, EID2, ZNF780B, EID2B, TDGF1P7, DLL3
TIMM50, ZNF780A, SUPT5H, CNTD2, PLEKHG2, AKT2, ZFP36, C19orf47, MED29, PLD3
PRX, PAF1, SERTAD1, SAMD4B, SERTAD3, BLVRB, GMFG, SPTBN4, SHKBP1, LRFN1
Negative: STX7, VNN1, MOXD1, VNN3, CTGF, SLC18B1, ENPP1, TCF21, MED23, TBPL1
ARG1, SLC2A12, EPB41L2, SGK1, SMLR1, HBS1L, TMEM200A, SAMD3, MYB, L3MBTL3
AHI1, ARHGAP18, LINC00271, PTPRK, PDE7B, ECHDC1, RNF146, SOGA3, RSPO3, KIAA0408
PC_ 3
Positive: MAP3K5, MAP7, PEX7, SLC35D3, FEZ2, BCLAF1, MTFR2, VIT, PDE7B, IFNGR1
STRN, LINC00271, HEATR5B, AL357060.1, AHI1, GPATCH11, EIF2AK2, TNFAIP3, MYB, CEBPZ
NDUFAF7, PERP, PRKD3, HBS1L, QPCT, CDC42EP3, RMDN2, CYP1B1, ATL2, RPLP0P6
Negative: IARS2, BPNT1, RAB3GAP2, EPRS, MARK1, RIMKLBP2, C1orf115, LYPLAL1, MARC2, TGFB2
MARC1, RRP15, SPATA17, HLX, GPATCH2, DUSP10, ESRRG, KCTD3, HHIPL2, KCNK2
CENPF, TAF1A, PTPN14, SMYD2, PROX1, RPS6KC1, MIA3, ANGEL2, VASH2, FLVCR1
PC_ 4
Positive: AP1B1, RFPL1S, RASL10A, RFPL1, DLG2, TMEM126B, CCDC90B, CREBZF, ANKRD42, EED
PICALM, CCDC81, SYTL2, ME3, PRSS23, GAS2L1, PCF11, OR7E13P, FZD4, TMEM135
RAB30, RAB38, NEFH, PRCP, EWSR1, FAM181B, TENM4, THOC5, RHBDD3, NARS2
Negative: FKBP6, BAZ1B, NSUN5, BCL7B, NCF1B, TBL2, STAG3L3, MLXIPL, VPS37D, NSUN5P2
DNAJC30, POM121, STX1A, AC016909.2, NCKAP5, RN7SL625P, MGAT5, ABHD11, LYPD1, TMEM163
ZNF806, CCNT2, SBDSP1, CLDN3, AC097532.2, MAP3K19, ANKRD30BL, CLDN4, RAB3GAP1, RN7SL377P
PC_ 5
Positive: MYB, HBS1L, AHI1, SGK1, LINC00271, SLC2A12, PDE7B, MTFR2, TBPL1, BCLAF1
TCF21, MAP7, MAP3K5, SLC18B1, PEX7, VNN3, TNFAIP3, AL357060.1, PERP, IFNGR1
MARCKSL1P2, SLC35D3, HEBP2, CCDC28A, REPS1, VNN1, ABRACL, HECA, STX7, MOXD1
Negative: APOL2, RBFOX2, MYH9, APOL6, TXN2, FOXRED2, EIF3D, CACNG2, IFT27, NCF4
TST, MPST, KCTD17, HORMAD2, IL2RB, AC003681.1, C1QTNF6, MTMR3, ASCC2, RAC2
ZMAT5, FAM92B, KIAA0513, GSE1, ZDHHC7, GINS2, CRISPLD2, C16orf74, USP10, EMC8
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: ZNF737, ZNF826P, ZNF486, ZNF90, AC006539.1, ZNF682, UNC50, ZNF326, ZNF93, AC011477.1
MGAT4A, ZNF644, ZNF253, ZNF506, YWHAQP5, HFM1, LINC00663, ZNF14, CDC7, KIAA1211L
ZNF101, RTN4IP1, ATP13A1, TSGA10, EPHX4, QRSL1, GMIP, C2orf15, MBLAC1, LAMTOR4
Negative: PCOLCE2, TRPC1, PAQR9, PLS1, U2SURP, ATR, CHST2, XRN1, SLC9A9, PRR23C
GK5, COPB2, CLSTN2, MRPS22, C3orf58, ACTG1P1, SLC25A36, TFDP2, PIK3CB, AC107021.1
SPSB4, ATP1B3, RNF7, AC112504.1, RASA2, FAIM, ZBTB38, PLOD2, CEP70, PLSCR1
PC_ 2
Positive: MRGBP, COL9A3, TCFL5, DIDO1, GID8, SLC17A9, YTHDF1, NKAIN4, ARFGAP1, COL20A1
KCNQ2, EEF1A2, PPDPF, PTK6, HELZ2, GMEB2, STMN3, RTEL1, ARFRP1, ZGPAT
INVS, LIME1, ERP44, SLC2A4RG, STX17, ZBTB46, NR4A3, ALG2, TPD52L2, TGFBR1
Negative: POLG, TICRR, FES, MAN2A2, BLM, UNC45A, EFCC1, GP9, CRTC3, RAB43
KIF7, KIAA1257, ISY1, ACAD9, HDDC3, CNBP, RAB7A, PEX11A, COPG1, FANCI
HMCES, RCCD1, RPN1, H1FX, RPL32P3, GATA2, PRC1, ABHD2, EFCAB12, EEFSEC
PC_ 3
Positive: TIMP4, PPARG, SYN2, TSEN2, TAMM41, MKRN2, VGLL4, RAF1, TMEM40, ATG7
CAND2, HRH1, IQSEC1, SLC6A11, NUP210, SEC13, HDAC11, GHRL, FBLN2, TATDN2
LINC00620, IRAK2, CHCHD4, VHL, TMEM43, XPC, SLC6A6, GRIP2, CCDC174, FGD5
Negative: HEBP2, MARCKSL1P2, CCDC28A, PERP, TNFAIP3, REPS1, AL357060.1, ABRACL, IFNGR1, HECA
SLC35D3, CITED2, PEX7, VTA1, MAP3K5, MVP, PAGR1, CDIPT, MAP7, PRRT2
AIG1, SEZ6L2, MAZ, KIF22, BCLAF1, ASPHD1, ZG16, SLC18B1, TCF21, MTFR2
PC_ 4
Positive: ASCC2, ZMAT5, MTMR3, CABP7, AC003681.1, NF2, HORMAD2, NIPSNAP1, TBC1D10A, THOC5
SF3A1, CCDC157, NEFH, SEC14L2, MTFP1, RFPL1, SEC14L6, RFPL1S, GAL3ST1, AP1B1
PES1, TCN2, SLC35E4, RASL10A, DUSP18, MORC2, TUG1, GAS2L1, SMTN, INPP5J
Negative: CGB5, NTF4, CGB2, KCNA7, RUVBL2, SNRNP70, GYS1, LIN7B, BAX, C19orf73
DHDH, PPFIA3, NUCB1, HRC, TULP2, PPP1R15A, TRPM4, PLEKHA4, CD37, HSD17B14
TEAD2, BCAT2, RASIP1, DKKL1, ACAD9, KIAA1257, RAB7A, EFCC1, GP9, RPN1
PC_ 5
Positive: EIF2AK2, CEBPZ, NDUFAF7, GPATCH11, PRKD3, HEATR5B, QPCT, STRN, CDC42EP3, VIT
RMDN2, CYP1B1, FEZ2, ATL2, RPLP0P6, HNRNPLL, CRIM1, GALM, SRSF7, GEMIN6
AC073218.2, SLC3A1, PREPL, CAMKMT, SIX3, SRBD1, PRKCE, EPAS1, ATP6V1E2, RHOQ
Negative: RUVBL1, SEC61A1, EEFSEC, MGLL, GATA2, ABTB1, RPN1, RAB7A, ACAD9, KIAA1257
EFCC1, GP9, RAB43, ISY1, CNBP, COPG1, HMCES, H1FX, RPL32P3, EFCAB12
MBD4, IFT122, PLXND1, NF2, NIPSNAP1, TMCC1, THOC5, CABP7, AC083799.1, NEFH
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: SLC38A9, DNM1L, ARV1, SPRTN, DISC1, CD302, ARL15, KCTD9, FBXO16, KIF13B
AC084262.1, DUSP4, TUBBP1, BRF2, GLB1L3, KDM5A, ERC1, WNT5B, FBXL14, ADIPOR2
DYRK4, AC024940.1, BICD1, SYT10, CNTN1, EXOC8, TSNAX, NTPCR, AC009506.1, LY75
Negative: TMEM169, XRCC5, PECR, SMARCAL1, MREG, LINC00607, IGFBP2, FN1, ATIC, IGFBP5
ABCA12, CLPSL1, ARMC12, BARD1, LHFPL5, FKBP5, SRPK1, SPAG16, ARPC2, TULP1
IKZF2, MAPK14, TEAD3, AC079610.1, MAPK13, ERBB4, FANCE, LANCL1, BRPF3, AAMP
PC_ 2
Positive: SLC35G2, STAG1, NCK1, PCNA, PCCB, DZIP1L, TMEM230, MSL2, DBR1, SLC23A2
PPP2R3A, RASSF2, ARMC8, EPHB1, PRNP, SMOX, NME9, CEP63, FTLP3, ANAPC13
MRAS, AMOTL2, ESYT3, RYK, CEP70, CLSTN2, SLCO2A1, SLC25A36, FAIM, RAB6B
Negative: PSMC1P1, FAM19A4, EOGT, TMF1, UBA3, ARL6IP5, FRMD4B, MITF, FOXP1, EIF4E3
GPR27, GBE1, CADM2, ROBO1, CHMP2B, AC026877.1, CGGBP1, ROBO2, PROK2, ZNF717
LINC00960, FRG2C, RYBP, ZNF654, RARRES2P1, SHQ1, C3orf38, PPP4R2, AC133041.1, EBLN2
PC_ 3
Positive: PALM3, LDLRAP1, ZNRF2P2, IL27RA, WIPF3, MAN1C1, SCRN1, MTFR1L, RFX1, FKBP14
AL020996.1, PLEKHA8, AUNIP, ZNRF2, DCAF15, PAQR7, NOD1, GGCT, STMN1, GARS
PODNL1, PAFAH2, GHRHR, ADCYAP1R1, SCARNA18, CC2D1A, EXTL1, PPP1R17, AL391650.1, PDIK1L
Negative: FAM98A, AC073218.2, RASGRP3, CRIM1, LTBP1, FEZ2, TTC27, VIT, STRN, HEATR5B
MPRIP, RN7SL775P, FLCN, GPATCH11, COPS3, NT5M, MED9, EIF2AK2, RASD1, PEMT
RAI1, CEBPZ, SREBF1, TOM1L2, NDUFAF7, ATPAF2, PRKD3, GID4, DRG2, QPCT
PC_ 4
Positive: TMEM109, PRPF19, CCDC86, MS4A7, MS4A4A, MS4A6A, MS4A3, MRPL16, STX3, AP000640.2
PATL1, OSBP, OR5A2, OR5AN1, CDK12, MPEG1, DTX4, MED1, FAM111A, FAM111B
FBXL20, AP001652.1, CACNB1, GLYATL2, ARL5C, ZFP91, PLXDC1, LINC00672, LPXN, LASP1
Negative: CDC37, TYK2, ICAM3, RAVER1, ZGLP1, ICAM4, ICAM1, ELOF1, MRPL4, S1PR2
DNMT1, EIF3G, DOC2B, P2RY11, ZNF846, RPH3AL, C17orf97, FBXL12, VPS53, PPAN
FAM57A, GEMIN4, RPL10P15, DBIL5P, C19orf66, ZNF627, PIN1, GLOD4, C3P1, OLFM2
PC_ 5
Positive: DOC2B, RPH3AL, C17orf97, VPS53, FAM57A, GEMIN4, DBIL5P, GLOD4, NXN, TIMM22
ABR, YWHAE, CRK, MYO1C, INPP5K, PITPNA, SCARF1, RILP, PRPF8, TLCD2
ZNF878, ZNF844, ZNF20, ZNF625, ZNF433, ZNF136, ZNF44, ZNF563, ZNF763, ELOF1
Negative: GAL3ST1, SEC14L6, MTFP1, SEC14L2, CCDC157, SF3A1, TBC1D10A, HORMAD2, AC003681.1, MTMR3
TOM1, ASCC2, ZMAT5, HMOX1, CABP7, NF2, NT5M, MED9, RASD1, COPS3
PEMT, RAI1, MYO15A, DRG2, FLCN, GID4, SREBF1, ATPAF2, ALKBH5, FLII
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: STC1, CENPBD1, SLC25A37, DEF8, ENTPD4, TUBB3, LOXL2, TCF25, DCAF11, PCK2
R3HCC1, SPIRE2, SNX16, NRL, AC009237.14, FANCA, RALYL, CHMP7, FAHD2A, NNAT
DHRS4L2, GPR137, ZNF276, LRRCC1, PROM2, DHRS4, TNFRSF10A, CTNNBL1, VPS9D1, ZNF2
Negative: TLK2P1, ASIC2, TMEM98, MYO1D, CCL2, CDK5R1, TMEM132E, PSMD11, ZNF207, C17orf75
RHBDL3, RHOT1, AC090616.2, RALGAPA2, RN7SL690P, CRNKL1, NAA20, RIN2, AL121761.1, AL049647.1
SLC24A3, DTD1, SEC23B, RBBP9, POLR3F, DZANK1, PARD6G, ADNP2, RBFADN, ZNF133
PC_ 2
Positive: AC044860.1, AKAP13, KLHL25, NTRK3, MRPL46, MRPS11, DET1, AC013489.1, AEN, ISG20
ACAN, HAPLN3, CRELD2, MFGE8, ALG12, ABHD2, FANCI, ZBED4, POLG, BRD1
TICRR, KIF7, FAM19A5, PEX11A, TBC1D22A, CERK, GRAMD4, CELSR1, TRMU, CRTC3
Negative: AC016722.2, MCFD2, AC016722.1, SOCS5, TTC7A, CRIPT, EPCAM, PIGF, MSH2, RHOQ
KCNK12, ATP6V1E2, EPAS1, AC079250.1, PRKCE, MSH6, SRBD1, FBXO11, SIX3, CAMKMT
AC079807.1, PREPL, SLC3A1, FOXN2, PPP1R21, STON1, LHCGR, FSHR, NRXN1, GEMIN6
PC_ 3
Positive: PEX11A, KIF7, TICRR, ABHD2, FANCI, MFGE8, POLG, HAPLN3, ACAN, ISG20
AEN, AC013489.1, DET1, MRPS11, MRPL46, CRTC3, BLM, NTRK3, FES, MAN2A2
KLHL25, UNC45A, HDDC3, AKAP13, RCCD1, PRC1, AC044860.1, VPS33B, SLCO3A1, ST8SIA2
Negative: MS4A4A, MS4A7, MS4A6A, CCDC86, MS4A3, PRPF19, MRPL16, TMEM109, STX3, TMEM132A
AP000640.2, PATL1, SLC15A3, OSBP, CD6, OR5A2, OR5AN1, VPS37C, AP001652.1, FAM111B
GLYATL2, FAM111A, DTX4, ZFP91, MPEG1, LPXN, CTNND1, PGA3, TMX2, MED19
PC_ 4
Positive: MRPL15, SOX17, LYPLA1, XKR4, TMEM68, TCEA1, TGS1, RGS20, LYN, ATP6V1H
RN7SL798P, PLAG1, OPRK1, CHCHD7, RB1CC1, SDR16C5, PCMTD1, IMPAD1, FAM110B, EFCAB1
UBXN2B, SDCBP, NSMAF, TOX, CA8, RAB2A, CHD7, AC022182.2, ASPH, GGH
Negative: PTBP1, AZU1, PRSS57, CFD, FSTL3, MED16, RNF126, R3HDM4, POLRMT, KISS1R
BSG, ARID3A, WDR18, CDC34, TMEM259, TPGS1, CNN2, ABCA7, MADCAM1, POLR2E
GPX4, SHC2, SBNO2, STK11, MIER2, MIDN, ABHD17A, SCAMP4, CIRBP, KLF16
PC_ 5
Positive: SEC14L6, GAL3ST1, PES1, INPP5J, SMTN, TUG1, MORC2, DUSP18, SLC35E4, TCN2
PRR14L, DEPDC5, YWHAH, C22orf42, RTCB, FBXO7, SYN3, TIMP3, HMGXB4, TOM1
HMOX1, AC138035.1, TRIM52, ASTE1, NEK11, ATP2C1, MCM5, TRIM41, NUDT16, PIK3R4
Negative: CNTN3, PDZRN3, FAM86DP, EBLN2, AC133041.1, PPP4R2, RARRES2P1, SHQ1, FRG2C, LINC00960
ZNF717, RYBP, ROBO2, AC026877.1, PROK2, ROBO1, C3orf38, GBE1, GPR27, CADM2
ZNF654, CHMP2B, CGGBP1, EIF4E3, FOXP1, MITF, FRMD4B, ARL6IP5, UBA3, TMF1
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: ASH2L, STAR, MYOM1, VWF, ANO2, DPP4, AC013731.1, NTF3, TBR1, CCDC148
KCNA5, ZNF540, PSMD14, VPS26B, GALNT8, AC009299.3, PKP4, NDUFA9, TANC1, AC009299.2
AKAP3, WDSUB1, UNC5D, DCP1B, TMTC1, DDX11, RHOF, EGLN1, SLC35F3, MARCH7
Negative: AC025594.1, ATP6V1F, AC018635.1, LRRC4, PODNL1, CC2D1A, SND1, DCAF15, C19orf57, RFX1
RN7SL619P, ARF5, IL27RA, ZSWIM4, PALM3, MRI1, C19orf67, GCC1, SAMD1, CCDC130
PRKACA, ASF1B, ZNF800, KIAA1143, CACNA1A, ZNF502, ZNF501, ZNF35, DDX39A, ZNF197
PC_ 2
Positive: AC009060.1, PDPR, CLEC18A, EXOSC6, WWP2, AARS, NOB1, DDX19B, NQO1, NFAT5
DDX19A, CYB5B, ST3GAL2, TERF2, NIP7, FUK, PDF, COG8, COG4, VPS4A
SF3B3, SNTB2, MTSS1L, VAC14, HYDIN, CMTR2, CHTF8, CALB2, ZNF23, ZNF19
Negative: FAM174B, LRRC36, ST8SIA2, PLEKHG4, TPPP3, SLC9A5, FHOD1, SLCO3A1, TMEM208, ZDHHC1
LRRC29, VPS33B, PRC1, HSD11B2, RCCD1, HDDC3, ATP6V0D1, UNC45A, MAN2A2, FES
CTCF, BLM, CRTC3, ACD, PARD6A, PEX11A, ENKD1, CIAPIN1, KIF7, TICRR
PC_ 3
Positive: CARTPT, MAP1B, MCCC2, BDP1, GFOD2, C16orf86, ENKD1, FNBP1P1, MOB1A, TET3
MTHFD2, DGUOK, SLC4A5, STAMBP, DCTN1, DUSP11, C2orf81, ALMS1, AC074008.1, EGR4
PARD6A, FBXO41, CCT7, ACD, PRADC1, SMYD5, CTCF, NOTO, DDX28, RAB11FIP5
Negative: STX11, SF3B5, PLAGL1, LTV1, PHACTR2, FUCA2, VDAC1P8, RPTOR, CHMP6, NPTX1
BAIAP2, ENDOV, AATK, RNF213, SLC38A10, SLC26A11, PEX3, SGSH, ACTG1, EIF4A3
NPLOC4, GAA, ADAT2, PDE6G, CCDC40, SMLR1, AIG1, EPB41L2, TBC1D16, VTA1
PC_ 4
Positive: CARTPT, MAP1B, MCCC2, BDP1, PPARD, FANCE, TEAD3, FNBP1P1, TET3, MOB1A
DGUOK, STAMBP, MTHFD2, DUSP11, BRPF3, SLC4A5, PNPLA1, TULP1, DCTN1, MAPK13
C2orf81, ALMS1, MAPK14, AC074008.1, SRPK1, FKBP5, KCTD20, LHFPL5, ARMC12, CLPSL1
Negative: TIMP3, HMGXB4, SYN3, TOM1, FBXO7, HMOX1, RTCB, MCM5, C22orf42, RASD2
YWHAH, DEPDC5, MB, PRR14L, APOL6, RBFOX2, APOL2, MYH9, TXN2, FOXRED2
EIF3D, INPP5J, ASCC2, SMTN, ZMAT5, MTMR3, CACNG2, TUG1, CABP7, AC003681.1
PC_ 5
Positive: RN7SL605P, CLUH, PAFAH1B1, RAP1GAP2, METTL16, SGSM2, TSR1, OR3A2, SRR, SMG6
OVCA2, ASPA, DPH1, RPA1, SMYD4, TRPV1, SERPINF1, MIR22HG, TLCD2, SHPK
PRPF8, RILP, CTNS, TAX1BP3, EMC6, P2RX5, ITGAE, CAMKK1, P2RX1, WASH5P
Negative: RASL10A, GAS2L1, AP1B1, EWSR1, RFPL1S, RHBDD3, RFPL1, NEFH, EMID1, THOC5
KREMEN1, NIPSNAP1, ZNRF3, NF2, XBP1, CABP7, PCMTD2, CCDC117, MYT1, OPRL1
ZMAT5, HSCB, RGS19, TCEA2, CHEK2, PRPF6, ASCC2, TTC28, SAMD10, ZNF512B
Computing nearest neighbor graph
Computing SNN
Warning: Data is of class matrix. Coercing to dgCMatrix.
Finding variable features for layer counts
Calculating gene variances
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Calculating feature variances of standardized and clipped values
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Centering and scaling data matrix
|
| | 0%
|
|===== | 8%
|
|=========== | 15%
|
|================ | 23%
|
|====================== | 31%
|
|=========================== | 38%
|
|================================ | 46%
|
|====================================== | 54%
|
|=========================================== | 62%
|
|================================================ | 69%
|
|====================================================== | 77%
|
|=========================================================== | 85%
|
|================================================================= | 92%
|
|======================================================================| 100%
PC_ 1
Positive: CUL4A, LAMP1, PCID2, RDH10, STAU2, TERF1, F10, UBE2W, GRTP1, TRPA1
TMEM70, EYA1, LY96, F7, ADPRHL1, XKR9, JPH1, LACTB2, GDAP1, MCF2L
CRISPLD1, TRAM1, HNF4G, ATP11A, NCOA2, FAM83F, ZFHX4, GPS1, DUS1L, GRAP2
Negative: TG, PHF20L1, TMEM71, LRRC6, HHLA1, EFR3A, B4GALT5, ADCY8, SLC9A8, CCL25
SPATA2, GRINA, ELAVL1, LRRC7, SLC2A10, LRRC40, PTGER3, SRSF11, CTH, HHLA3
TP53RK, ANKRD13C, RNF114, TIMM44, ZRANB2, OPLAH, SLC13A3, WFDC8, NEGR1, WFDC2
PC_ 2
Positive: GPR37, TMEM229A, POT1, WASL, GRM8, IQUB, RPS26P31, ZNF800, CADPS2, GCC1
ARF5, SND1, LRRC4, AC018635.1, RBM28, IMPDH1, HILPDA, METTL2B, FAM71F2, CALU
CCDC136, FLNC, KCP, ATP6V1F, AC025594.1, IRF5, TNPO3, ODCP, TSPAN33, SMO
Negative: USP33, TONSL, NEXN, ESCO1, TIE1, C1orf210, POC1B, DUSP6, MPL, GALNT4
KCTD1, KITLG, TP53INP1, ZNF362, ATP2B1, TMTC3, BTG1, CYHR1, TMEM125, CEP290
CDC20, EEA1, DSTYK, C12orf29, NUDT4, GREB1L, AQP4, UBE2N, FUBP1, MGAT4C
PC_ 3
Positive: AEN, PGPEP1L, ISG20, PEX11A, ACAN, CRTC3, HAPLN3, KIF7, BLM, MFGE8
IGF1R, ABHD2, TICRR, FES, FANCI, MAN2A2, POLG, UNC45A, ARRDC4, HDDC3
RCCD1, PRC1, LINC00923, VPS33B, SLCO3A1, ST8SIA2, MCTP2, FAM174B, CHD2, RGMA
Negative: RUNDC1, PTGES3L, AARSD1, G6PC, G6PC3, LINC00671, HDAC5, AOC2, C17orf53, PSME3
ASB16, BECN1, LRFN1, GMFG, TMUB2, SAMD4B, PAF1, MED29, ZFP36, ATXN7L3
PLEKHG2, SUPT5H, TIMM50, SERTAD3, SERTAD1, BLVRB, PRX, SPTBN4, PLD3, UBTF
PC_ 4
Positive: TFF2, TMPRSS3, CD14, RSPH1, SLC35A4, SLC37A1, APBB3, PDE9A, WDR4, EIF4EBP3
NDUFV3, PKNOX1, SRA1, CBS, U2AF1, SIK1, EXTL2, ANKHD1, SLC30A7, HSF2BP
DPH5, AC093157.1, H2BFS, HBEGF, S1PR1, RRP1B, RNPC3, AMY2B, PDXK, PRMT6
Negative: UBTF, AC003102.1, ATXN7L3, RUNDC3A, TMUB2, SLC25A39, ASB16, GRN, C17orf53, FAM171A2
HDAC5, ITGA2B, G6PC3, PTPRB, LGR5, KCNMB4, ZFC3H1, CNOT2, THAP2, MYRFL
TMEM19, RAB3IP, BEST3, LRRC10, MYF6, LIN7A, PPP1R12A, ACSS3, PAWR, SYT1
PC_ 5
Positive: KCTD3, ESRRG, GPATCH2, SPATA17, RRP15, TGFB2, LYPLAL1, RIMKLBP2, EPRS, BPNT1
IARS2, RAB3GAP2, MARK1, C1orf115, MARC2, MARC1, HLX, DUSP10, EWSR1, RHBDD3
EMID1, EFNA2, MUM1, C19orf24, CIRBP, MIDN, STK11, SBNO2, GPX4, NDUFS7
Negative: TFF2, TMPRSS3, RSPH1, CD14, SLC37A1, SLC35A4, PDE9A, APBB3, WDR4, NDUFV3
EIF4EBP3, PKNOX1, CBS, SRA1, U2AF1, SIK1, ANKHD1, HSF2BP, EXTL2, SLC30A7
H2BFS, DPH5, AC093157.1, RRP1B, HBEGF, S1PR1, PDXK, RNPC3, AMY2B, CSTB
Computing nearest neighbor graph
Computing SNN
mean_delta: 0.0144000477724848, at sigma: 0.008754607423138, and pval: 0.05
KS_delta: 0.00598709803646501, at sigma: 0.008754607423138, and pval: 0.05
In addition: Warning messages:
1: In log2xplus1(infercnv_obj) : NaNs produced
2: package ‘future’ was built under R version 4.4.1
3: `aes_string()` was deprecated in ggplot2 3.0.0.
ℹ Please use tidy evaluation idioms with `aes()`.
ℹ See also `vignette("ggplot2-in-packages")` for more information.
ℹ The deprecated feature was likely used in the infercnv package.
Please report the issue at
<https://github.com/broadinstitute/inferCNV/issues>.
This warning is displayed once every 8 hours.
Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
generated.
%%R
plot_cnv(
infercnv_obj_run,
out_dir = "output_dir",
output_filename = "infercnv_plott",
png_res = 300)
INFO [2025-10-08 15:50:42] ::plot_cnv:Start INFO [2025-10-08 15:50:42] ::plot_cnv:Current data dimensions (r,c)=14007,575 Total=8060977.94547191 Min=0.827673338207976 Max=1.15777732284435. INFO [2025-10-08 15:50:43] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 15:50:43] plot_cnv(): auto thresholding at: (0.968317 , 1.033409) INFO [2025-10-08 15:50:43] plot_cnv_observation:Start INFO [2025-10-08 15:50:43] Observation data size: Cells= 498 Genes= 14007 INFO [2025-10-08 15:50:43] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 15:50:43] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 15:50:43] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 15:50:43] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 15:50:44] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:44] Quantiles of plotted data range: 0.968317166671738,1.00003843897153,1.00003843897153,1.00003843897153,1.03340940953818 INFO [2025-10-08 15:50:44] plot_cnv_references:Start INFO [2025-10-08 15:50:44] Reference data size: Cells= 77 Genes= 14007 INFO [2025-10-08 15:50:44] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 15:50:44] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 15:50:44] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 15:50:44] Quantiles of plotted data range: 0.968317166671738,1.00003843897153,1.00003843897153,1.00003843897153,1.03340940953818 $cluster_by_groups [1] TRUE $k_obs_groups [1] 1 $contig_cex [1] 1 $x.center [1] 1.000863 $x.range [1] 0.9683172 1.0334094 $hclust_method [1] "ward.D" $color_safe_pal [1] FALSE $output_format [1] "png" $png_res [1] 300 $dynamic_resize [1] 0
Chapter 5: Integration¶
- Since ´infercnv´ was not able to create a full map of CNVs from this data. In this section, I have performed a single cell data integration of Tyser's et al. (2021) data together with Petropolus et. al (2016) data from this study, the data were obtained from this site Petropoulos & Lanner Labs. This site contains data derived from more than 30 papers, refined for integration, including the two studies we're targeting.¶
If you maintained the working directory and downloaded the counts and metadata files for both studies you can run the following lines!¶
#Metadata
metadata_tyser = pd.read_table("Tyser_2021_PMID_34789876.meta.tsv", index_col= "cell")
metadata_petro = pd.read_table("Petropoulos_2016_PMID_27062923.meta.tsv", index_col= "cell")
#counts
counts_tyser = pd.read_csv("Tyser_2021_PMID_34789876.counts.gz", index_col= "Gene")
counts_petro = pd.read_csv("Petropoulos_2016_PMID_27062923.counts.gz", index_col= "Gene")
# Transponse the counts to fit AnnData structure
counts_tyser = counts_tyser.T
counts_petro = counts_petro.T
# Build AnnData objects
adata_petro = ad.AnnData(
X=counts_petro.values, # expression matrix
obs=metadata_petro, # cell metadata
var=pd.DataFrame(index=counts_petro.columns) # gene metadata
)
adata_tyser = ad.AnnData(
X=counts_tyser.values, # expression matrix
obs=metadata_tyser, # cell metadata
var=pd.DataFrame(index=counts_tyser.columns) # gene metadata
)
adata_petro.var_names_make_unique()
sc.pp.calculate_qc_metrics(adata_petro, inplace=True)
sc.pp.normalize_total(adata_petro)
sc.pp.log1p(adata_petro)
sc.pp.scale(adata_petro, max_value=10)
sc.tl.pca(adata_petro, n_comps=30, random_state=42)
sc.pp.neighbors(adata_petro, n_neighbors=20, n_pcs=30, random_state=42)
sc.tl.umap(adata_petro, random_state=42)
normalizing counts per cell
finished (0:00:00)
computing PCA
with n_comps=30
finished (0:00:01)
computing neighbors
using 'X_pca' with n_pcs = 30
finished: added to `.uns['neighbors']`
`.obsp['distances']`, distances for each pair of neighbors
`.obsp['connectivities']`, weighted adjacency matrix (0:00:03)
computing UMAP
finished: added
'X_umap', UMAP coordinates (adata.obsm)
'umap', UMAP parameters (adata.uns) (0:00:04)
adata_tyser.var_names_make_unique()
sc.pp.calculate_qc_metrics(adata_tyser, inplace=True)
sc.pp.normalize_total(adata_tyser)
sc.pp.log1p(adata_tyser)
sc.pp.scale(adata_tyser, max_value=10)
sc.tl.pca(adata_tyser, n_comps=30,random_state=42)
sc.pp.neighbors(adata_tyser, n_neighbors=20, n_pcs=30,random_state=42)
sc.tl.umap(adata_tyser,random_state=42)
normalizing counts per cell
finished (0:00:00)
computing PCA
with n_comps=30
finished (0:00:01)
computing neighbors
using 'X_pca' with n_pcs = 30
finished: added to `.uns['neighbors']`
`.obsp['distances']`, distances for each pair of neighbors
`.obsp['connectivities']`, weighted adjacency matrix (0:00:00)
computing UMAP
finished: added
'X_umap', UMAP coordinates (adata.obsm)
'umap', UMAP parameters (adata.uns) (0:00:04)
sc.pp.highly_variable_genes(adata_petro)
sc.pp.highly_variable_genes(adata_tyser)
extracting highly variable genes
finished (0:00:00)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
extracting highly variable genes
finished (0:00:00)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
sc.tl.leiden(adata_petro, resolution= 0.75, flavor="igraph", n_iterations=2,random_state=42)
sc.tl.leiden(adata_tyser, resolution= 0.75, flavor="igraph", n_iterations=2,random_state=42)
running Leiden clustering
finished: found 13 clusters and added
'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
running Leiden clustering
finished: found 10 clusters and added
'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
fig, axs = plt.subplots(2, 2, figsize=(10,7), dpi=120)
sc.pl.umap(adata_tyser, color="leiden", ax=axs[0,0], show=False)
sc.pl.umap(adata_tyser, color="raw_annotation", ax=axs[0,1], show=False)
sc.pl.umap(adata_petro, color="leiden", ax=axs[1,0], show=False)
sc.pl.umap(adata_petro, color="raw_annotation", ax=axs[1,1], show=False)
plt.show()
Maintain shared genes between two objects before integration¶
shared_genes = adata_tyser.var_names.intersection(adata_petro.var_names)
adata_tyser = adata_tyser[:, shared_genes]
adata_petro = adata_petro[:, shared_genes]
print(f"Shared genes: {len(shared_genes)}")
Shared genes: 33501
Merge objects¶
adatas_merged = ad.concat([adata_tyser, adata_petro], merge="same")
#adatas_merged.obs #2363 rows × 17 columns
Apply independent normalization and scaling¶
adatas_merged.var_names_make_unique()
sc.pp.calculate_qc_metrics(adatas_merged, inplace=True)
sc.pp.scale(adatas_merged, max_value=10)
sc.tl.pca(adatas_merged, n_comps=30, random_state=42)
sc.pp.neighbors(adatas_merged, n_neighbors= 50, n_pcs=30, random_state=42)
sc.tl.umap(adatas_merged, random_state=42)
sc.pp.highly_variable_genes(adatas_merged)
sc.tl.leiden(adatas_merged, resolution= 0.75, random_state=42)
/Users/mohammedkhattab/Desktop/venv/lib/python3.12/site-packages/pandas/core/arraylike.py:399: RuntimeWarning: invalid value encountered in log1p result = getattr(ufunc, method)(*inputs, **kwargs) /Users/mohammedkhattab/Desktop/venv/lib/python3.12/site-packages/pandas/core/arraylike.py:399: RuntimeWarning: invalid value encountered in log1p result = getattr(ufunc, method)(*inputs, **kwargs)
computing PCA
with n_comps=30
finished (0:00:03)
computing neighbors
using 'X_pca' with n_pcs = 30
finished: added to `.uns['neighbors']`
`.obsp['distances']`, distances for each pair of neighbors
`.obsp['connectivities']`, weighted adjacency matrix (0:00:00)
computing UMAP
finished: added
'X_umap', UMAP coordinates (adata.obsm)
'umap', UMAP parameters (adata.uns) (0:00:22)
extracting highly variable genes
finished (0:00:01)
--> added
'highly_variable', boolean vector (adata.var)
'means', float vector (adata.var)
'dispersions', float vector (adata.var)
'dispersions_norm', float vector (adata.var)
running Leiden clustering
/var/folders/sm/v_yn0d7j7xl1n8zc_c47_tgc0000gn/T/ipykernel_8442/573973735.py:9: FutureWarning: In the future, the default backend for leiden will be igraph instead of leidenalg. To achieve the future defaults please pass: flavor="igraph" and n_iterations=2. directed must also be False to work with igraph's implementation. sc.tl.leiden(adatas_merged, resolution= 0.75, random_state=42)
finished: found 11 clusters and added
'leiden', the cluster labels (adata.obs, categorical) (0:00:00)
Unintegrated¶
fig, axs = plt.subplots(1, 2, figsize=(15, 4), dpi=120)
# Example: one UMAP by dataset, one by raw annotation
sc.pl.umap(adatas_merged, color="leiden", ax=axs[0], show=False)
sc.pl.umap(adatas_merged, color="raw_annotation", ax=axs[1], show=False)
plt.tight_layout()
plt.show()
Run Scanorama correction¶
adatas = [adata_tyser, adata_petro]
corrected_mats, genes = scanorama.correct(
[ad.X for ad in adatas],
[ad.var_names for ad in adatas],
return_dimred=False
)
# Convert gene list to Index
shared_genes = pd.Index(genes)
# Subset both to the same genes and replace expression
for i in range(len(adatas)):
adatas[i] = adatas[i][:, shared_genes]
adatas[i].X = corrected_mats[i]
# Merge integrated AnnData
adata_integrated = adatas[0].concatenate(
adatas[1],
batch_key="dataset",
batch_categories=["Tyser", "Petropoulos"],
index_unique=None
)
print(adata_integrated)
Found 33501 genes among all datasets [[0. 0.41] [0. 0. ]] Processing datasets (0, 1)
/var/folders/sm/v_yn0d7j7xl1n8zc_c47_tgc0000gn/T/ipykernel_8442/1314771505.py:15: UserWarning: Trying to set a dense array with a sparse array on a view.Densifying the sparse array.This may incur excessive memory usage adatas[i].X = corrected_mats[i] /var/folders/sm/v_yn0d7j7xl1n8zc_c47_tgc0000gn/T/ipykernel_8442/1314771505.py:15: ImplicitModificationWarning: Modifying `X` on a view results in data being overridden adatas[i].X = corrected_mats[i] /var/folders/sm/v_yn0d7j7xl1n8zc_c47_tgc0000gn/T/ipykernel_8442/1314771505.py:18: FutureWarning: Use anndata.concat instead of AnnData.concatenate, AnnData.concatenate is deprecated and will be removed in the future. See the tutorial for concat at: https://anndata.readthedocs.io/en/latest/concatenation.html adata_integrated = adatas[0].concatenate(
AnnData object with n_obs × n_vars = 2363 × 33501
obs: 'devTime', 'raw_annotation', 'pred_annotation', 'sub_pred_annotation', 'prediction_score_max', 'pred_psdt', 'proj_UMAP_1', 'proj_UMAP_2', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'pct_counts_in_top_50_genes', 'pct_counts_in_top_100_genes', 'pct_counts_in_top_200_genes', 'pct_counts_in_top_500_genes', 'leiden', 'dataset'
var: 'n_cells_by_counts-Petropoulos', 'mean_counts-Petropoulos', 'log1p_mean_counts-Petropoulos', 'pct_dropout_by_counts-Petropoulos', 'total_counts-Petropoulos', 'log1p_total_counts-Petropoulos', 'mean-Petropoulos', 'std-Petropoulos', 'highly_variable-Petropoulos', 'means-Petropoulos', 'dispersions-Petropoulos', 'dispersions_norm-Petropoulos', 'n_cells_by_counts-Tyser', 'mean_counts-Tyser', 'log1p_mean_counts-Tyser', 'pct_dropout_by_counts-Tyser', 'total_counts-Tyser', 'log1p_total_counts-Tyser', 'mean-Tyser', 'std-Tyser', 'highly_variable-Tyser', 'means-Tyser', 'dispersions-Tyser', 'dispersions_norm-Tyser'
obsm: 'X_pca', 'X_umap'
adata_integrated.shape # should show (n_cells_total, ~shared_genes)
adata_integrated.obs["dataset"].value_counts()
dataset Petropoulos 1193 Tyser 1170 Name: count, dtype: int64
sc.pp.scale(adata_integrated, max_value=10)
sc.tl.pca(adata_integrated, n_comps=30, random_state=42)
sc.pp.neighbors(adata_integrated, n_neighbors= 50, n_pcs=30, random_state=42)
sc.tl.umap(adata_integrated, random_state=42)
computing PCA
with n_comps=30
finished (0:00:03)
computing neighbors
using 'X_pca' with n_pcs = 30
finished: added to `.uns['neighbors']`
`.obsp['distances']`, distances for each pair of neighbors
`.obsp['connectivities']`, weighted adjacency matrix (0:00:00)
computing UMAP
finished: added
'X_umap', UMAP coordinates (adata.obsm)
'umap', UMAP parameters (adata.uns) (0:00:22)
Integrated UMAP¶
fig, axs = plt.subplots(1, 2, figsize=(15, 4), dpi=120)
# Example: one UMAP by dataset, one by raw annotation
sc.pl.umap(adata_integrated, color="dataset", ax=axs[0], show=False)
sc.pl.umap(adata_integrated, color="raw_annotation", ax=axs[1], show=False)
plt.tight_layout()
plt.show()
sc.pl.umap(adata_integrated, color=["MESP1", "SOX2"], ncols=2)
Prepare files for inferCNV¶
If wanted save a version of the Integrated AnnData object¶
adata_integrated.write("adata_obj_Integrated.h5ad")
#add metadata
adata_integrated.obs.to_csv("metadata_integrated.csv")
%%R
metadata_integrated <- read_csv("metadata_integrated.csv")
Rows: 2363 Columns: 19 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "," chr (6): cell, devTime, raw_annotation, pred_annotation, sub_pred_annotatio... dbl (13): prediction_score_max, pred_psdt, proj_UMAP_1, proj_UMAP_2, n_genes... ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. # A tibble: 2,363 × 19 cell devTime raw_annotation pred_annotation sub_pred_annotation <chr> <chr> <chr> <chr> <chr> 1 sc7785290 CS7 HEP HEP HEP 2 sc7786612 CS7 DE Ambiguous Ambiguous 3 sc7786605 CS7 AdvMes AdvMes AdvMes 4 sc7785737 CS7 PriS Ambiguous Ambiguous 5 sc7785398 CS7 ExE_Mes ExE_Mes ExE_Mes 6 sc7788091 CS7 Axial Mes Axial Mes Axial Mes 7 sc7785785 CS7 Erythroblasts Erythroblasts Erythroblasts 8 sc7785959 CS7 ExE_Mes ExE_Mes ExE_Mes 9 sc7785611 CS7 YSE YSE YSE 10 sc7786585 CS7 Mesoderm Mesoderm Mesoderm # ℹ 2,353 more rows # ℹ 14 more variables: prediction_score_max <dbl>, pred_psdt <dbl>, # proj_UMAP_1 <dbl>, proj_UMAP_2 <dbl>, n_genes_by_counts <dbl>, # log1p_n_genes_by_counts <dbl>, total_counts <dbl>, # log1p_total_counts <dbl>, pct_counts_in_top_50_genes <dbl>, # pct_counts_in_top_100_genes <dbl>, pct_counts_in_top_200_genes <dbl>, # pct_counts_in_top_500_genes <dbl>, leiden <dbl>, dataset <chr> # ℹ Use `print(n = ...)` to see more rows
annotated_cluster_integrated = adata_integrated.obs[['raw_annotation']].copy()
| raw_annotation | |
|---|---|
| cell | |
| sc7785290 | HEP |
| sc7786612 | DE |
| sc7786605 | AdvMes |
| sc7785737 | PriS |
| sc7785398 | ExE_Mes |
| ... | ... |
| E7.9.564 | TE |
| E7.9.567 | TE |
| E7.9.568 | TE |
| E7.9.570 | TE |
| E7.9.573 | TE |
2363 rows × 1 columns
annotated_cluster_integrated.to_csv("cell_annotations_integrated.tsv", sep="\t", header=False)
merged_counts = pd.concat([counts_petro, counts_tyser], axis=0)
merged_counts #2363 rows × 33501 columns
| Gene | MIR1302-2HG | FAM138A | OR4F5 | AL627309.1 | AL627309.3 | AL627309.2 | AL627309.4 | AL732372.1 | OR4F29 | AC114498.1 | ... | AC007325.2 | BX072566.1 | AL354822.1 | AC023491.2 | AC004556.1 | AC233755.2 | AC233755.1 | AC240274.1 | AC213203.1 | FAM231C |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| E3.1.443 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 9 | 0 | 0 |
| E3.1.444 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 6 | 0 | 0 |
| E3.1.445 | 0 | 0 | 0 | 18 | 4 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
| E3.1.447 | 0 | 0 | 0 | 9 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| E3.1.448 | 0 | 0 | 0 | 10 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| sc7785965 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| sc7788259 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| sc7786123 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| sc7786212 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| sc7785932 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
2363 rows × 33501 columns
merged_counts.T.to_csv("raw_counts_integrated.tsv", sep="\t")
%%R
raw_counts_int <- read.table("raw_counts_integrated.tsv", sep="\t", header=TRUE, row.names=1, check.names=FALSE)
head(rownames(raw_counts_int))
[1] "MIR1302-2HG" "FAM138A" "OR4F5" "AL627309.1" "AL627309.3" [6] "AL627309.2"
%%R
annotation_file <- read_tsv("cell_annotations_integrated.tsv")
annotation_file
Rows: 2362 Columns: 2 ── Column specification ──────────────────────────────────────────────────────── Delimiter: "\t" chr (2): sc7785290, HEP ℹ Use `spec()` to retrieve the full column specification for this data. ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message. # A tibble: 2,362 × 2 sc7785290 HEP <chr> <chr> 1 sc7786612 DE 2 sc7786605 AdvMes 3 sc7785737 PriS 4 sc7785398 ExE_Mes 5 sc7788091 Axial Mes 6 sc7785785 Erythroblasts 7 sc7785959 ExE_Mes 8 sc7785611 YSE 9 sc7786585 Mesoderm 10 sc7785580 ExE_Mes # ℹ 2,352 more rows # ℹ Use `print(n = ...)` to see more rows
%%R
#Gene order file found online as mentioned above
gene_order <- read_tsv("gencode_v19_gene_pos.txt")
gene_order
%%R
colnames(gene_order) <- c("gene","chr","start","end")
# Optional: save clean version for inferCNV (no header, 4 cols)
write.table(gene_order, "gene_order_integrated.tsv",
sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE)
%%R
#make sure the gene_order have same genes as in raw_counts to prevent errors
gene_order_updated <- subset(gene_order, gene %in% rownames(raw_counts))
gene_order_updated <- gene_order_updated[match(rownames(raw_counts), gene_order_updated$gene), ]
head(gene_order_updated)
# A tibble: 6 × 4 gene chr start end <chr> <chr> <dbl> <dbl> 1 A1BG chr19 58856544 58864865 2 A1CF chr10 52559169 52645435 3 A2M chr12 9220260 9268825 4 A2ML1 chr12 8975068 9039597 5 A2MP1 chr12 9381129 9428413 6 A3GALT2 chr1 33772367 33786699
%%R
# Remove any rows with NA values
gene_order_updated <- na.omit(gene_order_updated)
head(gene_order_updated)
# A tibble: 6 × 4 gene chr start end <chr> <chr> <dbl> <dbl> 1 A1BG chr19 58856544 58864865 2 A1CF chr10 52559169 52645435 3 A2M chr12 9220260 9268825 4 A2ML1 chr12 8975068 9039597 5 A2MP1 chr12 9381129 9428413 6 A3GALT2 chr1 33772367 33786699
%%R
# Keep only genes that exist in gene_order
common_genes <- intersect(rownames(raw_counts), gene_order_updated$gene)
# Subset expression matrix
raw_counts <- raw_counts[common_genes, ]
gene_order_final <- gene_order_updated[match(rownames(raw_counts), gene_order_updated$gene), ]
# Sanity check
all(rownames(raw_counts) == gene_order_final$gene)
[1] TRUE
%%R
# Optional: save clean version for inferCNV (no header, 4 cols)
write.table(gene_order, "gene_order_integrated_final.tsv",
sep="\t", quote=FALSE, row.names=FALSE, col.names=FALSE)
InferCNV for integrated dataset¶
%%R
infercnv_obj <- CreateInfercnvObject(raw_counts_matrix = as.matrix(raw_counts_int),
annotations_file = "cell_annotations_integrated.tsv",
gene_order_file = "gene_order_integrated_final.tsv",
delim = "\t",
ref_group_names = "Epiblast" )
INFO [2025-10-08 16:10:40] Parsing gene order file: gene_order_integrated_final.tsv INFO [2025-10-08 16:10:40] Parsing cell annotations file: cell_annotations_integrated.tsv INFO [2025-10-08 16:10:40] ::order_reduce:Start. INFO [2025-10-08 16:10:40] .order_reduce(): expr and order match. INFO [2025-10-08 16:10:40] ::process_data:order_reduce:Reduction from positional data, new dimensions (r,c) = 33501,2363 Total=4698023250 Min=0 Max=507967. INFO [2025-10-08 16:10:40] num genes removed taking into account provided gene ordering list: 12754 = 38.070505358049% removed. INFO [2025-10-08 16:10:40] -filtering out cells < 100 or > Inf, removing 0 % of cells WARN [2025-10-08 16:10:41] Please use "options(scipen = 100)" before running infercnv if you are using the analysis_mode="subclusters" option or you may encounter an error while the hclust is being generated. INFO [2025-10-08 16:10:42] validating infercnv_obj
%%R
infercnv_obj_run <- infercnv::run(
infercnv_obj,
out_dir = "output_dir_integrated",
cutoff = 1, # Works well with SmartSeq2
min_cells_per_gene = 10, # relax cell filter
HMM = T,
analysis_mode="subclusters", #inferCNV will attempt to find subpopulations with distinct CNV patterns, rather than assuming each provided group is uniform
denoise = T)
INFO [2025-10-08 16:10:42] ::process_data:Start INFO [2025-10-08 16:10:42] Creating output path output_dir_integrated INFO [2025-10-08 16:10:42] Checking for saved results. INFO [2025-10-08 16:10:42] STEP 1: incoming data INFO [2025-10-08 16:11:00] STEP 02: Removing lowly expressed genes INFO [2025-10-08 16:11:00] ::above_min_mean_expr_cutoff:Start INFO [2025-10-08 16:11:00] Removing 6993 genes from matrix as below mean expr threshold: 1 INFO [2025-10-08 16:11:00] validating infercnv_obj INFO [2025-10-08 16:11:00] There are 13754 genes and 2363 cells remaining in the expr matrix. INFO [2025-10-08 16:11:01] no genes removed due to min cells/gene filter INFO [2025-10-08 16:11:18] STEP 03: normalization by sequencing depth INFO [2025-10-08 16:11:18] normalizing counts matrix by depth INFO [2025-10-08 16:11:19] Computed total sum normalization factor as median libsize: 473728.000000 INFO [2025-10-08 16:11:19] Adding h-spike INFO [2025-10-08 16:11:19] -hspike modeling of Epiblast INFO [2025-10-08 16:13:26] validating infercnv_obj INFO [2025-10-08 16:13:26] normalizing counts matrix by depth INFO [2025-10-08 16:13:26] Using specified normalization factor: 473728.000000 INFO [2025-10-08 16:13:40] STEP 04: log transformation of data INFO [2025-10-08 16:13:40] transforming log2xplus1() INFO [2025-10-08 16:13:40] -mirroring for hspike INFO [2025-10-08 16:13:40] transforming log2xplus1() INFO [2025-10-08 16:13:55] STEP 08: removing average of reference data (before smoothing) INFO [2025-10-08 16:13:55] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 16:13:55] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 16:13:57] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 16:13:59] -mirroring for hspike INFO [2025-10-08 16:13:59] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 16:13:59] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 16:14:02] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 16:14:21] STEP 09: apply max centered expression threshold: 3 INFO [2025-10-08 16:14:21] ::process_data:setting max centered expr, threshold set to: +/-: 3 INFO [2025-10-08 16:14:22] -mirroring for hspike INFO [2025-10-08 16:14:22] ::process_data:setting max centered expr, threshold set to: +/-: 3 INFO [2025-10-08 16:14:41] STEP 10: Smoothing data per cell by chromosome INFO [2025-10-08 16:14:41] smooth_by_chromosome: chr: chr1 INFO [2025-10-08 16:14:44] smooth_by_chromosome: chr: chr2 INFO [2025-10-08 16:14:47] smooth_by_chromosome: chr: chr3 INFO [2025-10-08 16:14:49] smooth_by_chromosome: chr: chr4 INFO [2025-10-08 16:14:51] smooth_by_chromosome: chr: chr5 INFO [2025-10-08 16:14:54] smooth_by_chromosome: chr: chr6 INFO [2025-10-08 16:14:55] smooth_by_chromosome: chr: chr7 INFO [2025-10-08 16:14:58] smooth_by_chromosome: chr: chr8 INFO [2025-10-08 16:14:59] smooth_by_chromosome: chr: chr9 INFO [2025-10-08 16:15:01] smooth_by_chromosome: chr: chr10 INFO [2025-10-08 16:15:04] smooth_by_chromosome: chr: chr11 INFO [2025-10-08 16:15:05] smooth_by_chromosome: chr: chr12 INFO [2025-10-08 16:15:07] smooth_by_chromosome: chr: chr13 INFO [2025-10-08 16:15:09] smooth_by_chromosome: chr: chr14 INFO [2025-10-08 16:15:13] smooth_by_chromosome: chr: chr15 INFO [2025-10-08 16:15:14] smooth_by_chromosome: chr: chr16 INFO [2025-10-08 16:15:16] smooth_by_chromosome: chr: chr17 INFO [2025-10-08 16:15:18] smooth_by_chromosome: chr: chr18 INFO [2025-10-08 16:15:20] smooth_by_chromosome: chr: chr19 INFO [2025-10-08 16:15:24] smooth_by_chromosome: chr: chr20 INFO [2025-10-08 16:15:25] smooth_by_chromosome: chr: chr21 INFO [2025-10-08 16:15:27] smooth_by_chromosome: chr: chr22 INFO [2025-10-08 16:15:29] -mirroring for hspike INFO [2025-10-08 16:15:29] smooth_by_chromosome: chr: chrA INFO [2025-10-08 16:15:29] smooth_by_chromosome: chr: chr_0 INFO [2025-10-08 16:15:30] smooth_by_chromosome: chr: chr_B INFO [2025-10-08 16:15:30] smooth_by_chromosome: chr: chr_0pt5 INFO [2025-10-08 16:15:30] smooth_by_chromosome: chr: chr_C INFO [2025-10-08 16:15:30] smooth_by_chromosome: chr: chr_1pt5 INFO [2025-10-08 16:15:31] smooth_by_chromosome: chr: chr_D INFO [2025-10-08 16:15:31] smooth_by_chromosome: chr: chr_2pt0 INFO [2025-10-08 16:15:31] smooth_by_chromosome: chr: chr_E INFO [2025-10-08 16:15:31] smooth_by_chromosome: chr: chr_3pt0 INFO [2025-10-08 16:15:31] smooth_by_chromosome: chr: chr_F INFO [2025-10-08 16:15:53] STEP 11: re-centering data across chromosome after smoothing INFO [2025-10-08 16:15:53] ::center_smooth across chromosomes per cell INFO [2025-10-08 16:15:56] -mirroring for hspike INFO [2025-10-08 16:15:56] ::center_smooth across chromosomes per cell INFO [2025-10-08 16:16:17] STEP 12: removing average of reference data (after smoothing) INFO [2025-10-08 16:16:17] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 16:16:17] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 16:16:19] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 16:16:21] -mirroring for hspike INFO [2025-10-08 16:16:21] ::subtract_ref_expr_from_obs:Start inv_log=FALSE, use_bounds=TRUE INFO [2025-10-08 16:16:21] subtracting mean(normal) per gene per cell across all data INFO [2025-10-08 16:16:23] -subtracting expr per gene, use_bounds=TRUE INFO [2025-10-08 16:16:43] STEP 14: invert log2(FC) to FC INFO [2025-10-08 16:16:43] invert_log2(), computing 2^x INFO [2025-10-08 16:16:44] -mirroring for hspike INFO [2025-10-08 16:16:44] invert_log2(), computing 2^x INFO [2025-10-08 16:17:04] STEP 15: computing tumor subclusters via leiden INFO [2025-10-08 16:17:04] define_signif_tumor_subclusters(p_val=0.1 INFO [2025-10-08 16:17:05] define_signif_tumor_subclusters(), tumor: 8 cell INFO [2025-10-08 16:17:05] Setting auto leiden resolution for 8 cell to 0.237326 INFO [2025-10-08 16:17:06] define_signif_tumor_subclusters(), tumor: AdvMes INFO [2025-10-08 16:17:06] Setting auto leiden resolution for AdvMes to 0.153905 INFO [2025-10-08 16:17:06] define_signif_tumor_subclusters(), tumor: Amnion.Ecto INFO [2025-10-08 16:17:06] Less cells in group Amnion.Ecto than k_nn setting. Keeping as a single subcluster. INFO [2025-10-08 16:17:06] define_signif_tumor_subclusters(), tumor: Axial Mes INFO [2025-10-08 16:17:06] Setting auto leiden resolution for Axial Mes to 0.593499 INFO [2025-10-08 16:17:07] define_signif_tumor_subclusters(), tumor: DE INFO [2025-10-08 16:17:07] Setting auto leiden resolution for DE to 0.279029 INFO [2025-10-08 16:17:08] define_signif_tumor_subclusters(), tumor: EPI.PrE.INT INFO [2025-10-08 16:17:08] Less cells in group EPI.PrE.INT than k_nn setting. Keeping as a single subcluster. INFO [2025-10-08 16:17:08] define_signif_tumor_subclusters(), tumor: Erythroblasts INFO [2025-10-08 16:17:08] Setting auto leiden resolution for Erythroblasts to 0.430269 INFO [2025-10-08 16:17:09] define_signif_tumor_subclusters(), tumor: ExE_Mes INFO [2025-10-08 16:17:09] Setting auto leiden resolution for ExE_Mes to 0.125055 INFO [2025-10-08 16:17:10] define_signif_tumor_subclusters(), tumor: HEP INFO [2025-10-08 16:17:10] Setting auto leiden resolution for HEP to 0.149089 INFO [2025-10-08 16:17:11] define_signif_tumor_subclusters(), tumor: Hypoblast INFO [2025-10-08 16:17:11] Setting auto leiden resolution for Hypoblast to 0.21198 INFO [2025-10-08 16:17:12] define_signif_tumor_subclusters(), tumor: ICM INFO [2025-10-08 16:17:12] Setting auto leiden resolution for ICM to 0.468206 INFO [2025-10-08 16:17:13] define_signif_tumor_subclusters(), tumor: Late_Amnion INFO [2025-10-08 16:17:13] Setting auto leiden resolution for Late_Amnion to 0.482523 INFO [2025-10-08 16:17:13] define_signif_tumor_subclusters(), tumor: Mesoderm INFO [2025-10-08 16:17:13] Setting auto leiden resolution for Mesoderm to 0.06727 INFO [2025-10-08 16:17:15] define_signif_tumor_subclusters(), tumor: Morula INFO [2025-10-08 16:17:15] Setting auto leiden resolution for Morula to 0.107509 INFO [2025-10-08 16:17:16] define_signif_tumor_subclusters(), tumor: Prelineage INFO [2025-10-08 16:17:16] Setting auto leiden resolution for Prelineage to 0.153905 INFO [2025-10-08 16:17:17] define_signif_tumor_subclusters(), tumor: PriS INFO [2025-10-08 16:17:17] Setting auto leiden resolution for PriS to 0.100576 INFO [2025-10-08 16:17:19] define_signif_tumor_subclusters(), tumor: TE INFO [2025-10-08 16:17:19] Setting auto leiden resolution for TE to 0.0307126 INFO [2025-10-08 16:17:21] define_signif_tumor_subclusters(), tumor: YSE INFO [2025-10-08 16:17:21] Setting auto leiden resolution for YSE to 0.283628 INFO [2025-10-08 16:17:22] define_signif_tumor_subclusters(), tumor: Epiblast INFO [2025-10-08 16:17:22] Setting auto leiden resolution for Epiblast to 0.0852357 INFO [2025-10-08 16:17:23] -mirroring for hspike INFO [2025-10-08 16:17:23] define_signif_tumor_subclusters(p_val=0.1 INFO [2025-10-08 16:17:23] define_signif_tumor_subclusters(), tumor: spike_tumor_cell_Epiblast INFO [2025-10-08 16:17:23] cut tree into: 1 groups INFO [2025-10-08 16:17:23] -processing spike_tumor_cell_Epiblast,spike_tumor_cell_Epiblast_s1 INFO [2025-10-08 16:17:23] define_signif_tumor_subclusters(), tumor: simnorm_cell_Epiblast INFO [2025-10-08 16:17:23] cut tree into: 1 groups INFO [2025-10-08 16:17:23] -processing simnorm_cell_Epiblast,simnorm_cell_Epiblast_s1 INFO [2025-10-08 16:17:44] ::plot_cnv:Start INFO [2025-10-08 16:17:44] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=32934377.1150229 Min=0.202126164889467 Max=3.31282454939968. INFO [2025-10-08 16:17:44] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 16:17:45] plot_cnv(): auto thresholding at: (0.554558 , 1.472130) INFO [2025-10-08 16:17:45] plot_cnv_observation:Start INFO [2025-10-08 16:17:45] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 16:17:45] clustering observations via method: ward.D INFO [2025-10-08 16:17:45] Number of cells in group(1) is 41 INFO [2025-10-08 16:17:45] group size being clustered: 41,13754 INFO [2025-10-08 16:17:45] Number of cells in group(2) is 23 INFO [2025-10-08 16:17:45] group size being clustered: 23,13754 INFO [2025-10-08 16:17:46] Number of cells in group(3) is 43 INFO [2025-10-08 16:17:46] group size being clustered: 43,13754 INFO [2025-10-08 16:17:46] Number of cells in group(4) is 31 INFO [2025-10-08 16:17:46] group size being clustered: 31,13754 INFO [2025-10-08 16:17:46] Number of cells in group(5) is 28 INFO [2025-10-08 16:17:46] group size being clustered: 28,13754 INFO [2025-10-08 16:17:46] Number of cells in group(6) is 3 INFO [2025-10-08 16:17:46] group size being clustered: 3,13754 INFO [2025-10-08 16:17:46] Number of cells in group(7) is 1 INFO [2025-10-08 16:17:46] Skipping group: 7, since less than 2 entries INFO [2025-10-08 16:17:46] Number of cells in group(8) is 20 INFO [2025-10-08 16:17:46] group size being clustered: 20,13754 INFO [2025-10-08 16:17:46] Number of cells in group(9) is 22 INFO [2025-10-08 16:17:46] group size being clustered: 22,13754 INFO [2025-10-08 16:17:46] Number of cells in group(10) is 52 INFO [2025-10-08 16:17:46] group size being clustered: 52,13754 INFO [2025-10-08 16:17:46] Number of cells in group(11) is 1 INFO [2025-10-08 16:17:46] Skipping group: 11, since less than 2 entries INFO [2025-10-08 16:17:46] Number of cells in group(12) is 16 INFO [2025-10-08 16:17:46] group size being clustered: 16,13754 INFO [2025-10-08 16:17:46] Number of cells in group(13) is 32 INFO [2025-10-08 16:17:46] group size being clustered: 32,13754 INFO [2025-10-08 16:17:46] Number of cells in group(14) is 57 INFO [2025-10-08 16:17:46] group size being clustered: 57,13754 INFO [2025-10-08 16:17:46] Number of cells in group(15) is 38 INFO [2025-10-08 16:17:46] group size being clustered: 38,13754 INFO [2025-10-08 16:17:46] Number of cells in group(16) is 28 INFO [2025-10-08 16:17:46] group size being clustered: 28,13754 INFO [2025-10-08 16:17:46] Number of cells in group(17) is 9 INFO [2025-10-08 16:17:46] group size being clustered: 9,13754 INFO [2025-10-08 16:17:46] Number of cells in group(18) is 2 INFO [2025-10-08 16:17:46] group size being clustered: 2,13754 INFO [2025-10-08 16:17:46] Number of cells in group(19) is 1 INFO [2025-10-08 16:17:46] Skipping group: 19, since less than 2 entries INFO [2025-10-08 16:17:46] Number of cells in group(20) is 35 INFO [2025-10-08 16:17:46] group size being clustered: 35,13754 INFO [2025-10-08 16:17:46] Number of cells in group(21) is 33 INFO [2025-10-08 16:17:46] group size being clustered: 33,13754 INFO [2025-10-08 16:17:46] Number of cells in group(22) is 23 INFO [2025-10-08 16:17:46] group size being clustered: 23,13754 INFO [2025-10-08 16:17:46] Number of cells in group(23) is 18 INFO [2025-10-08 16:17:46] group size being clustered: 18,13754 INFO [2025-10-08 16:17:46] Number of cells in group(24) is 1 INFO [2025-10-08 16:17:46] Skipping group: 24, since less than 2 entries INFO [2025-10-08 16:17:46] Number of cells in group(25) is 45 INFO [2025-10-08 16:17:46] group size being clustered: 45,13754 INFO [2025-10-08 16:17:46] Number of cells in group(26) is 28 INFO [2025-10-08 16:17:46] group size being clustered: 28,13754 INFO [2025-10-08 16:17:46] Number of cells in group(27) is 29 INFO [2025-10-08 16:17:46] group size being clustered: 29,13754 INFO [2025-10-08 16:17:46] Number of cells in group(28) is 28 INFO [2025-10-08 16:17:46] group size being clustered: 28,13754 INFO [2025-10-08 16:17:46] Number of cells in group(29) is 95 INFO [2025-10-08 16:17:46] group size being clustered: 95,13754 INFO [2025-10-08 16:17:46] Number of cells in group(30) is 61 INFO [2025-10-08 16:17:46] group size being clustered: 61,13754 INFO [2025-10-08 16:17:46] Number of cells in group(31) is 38 INFO [2025-10-08 16:17:46] group size being clustered: 38,13754 INFO [2025-10-08 16:17:46] Number of cells in group(32) is 33 INFO [2025-10-08 16:17:46] group size being clustered: 33,13754 INFO [2025-10-08 16:17:46] Number of cells in group(33) is 30 INFO [2025-10-08 16:17:46] group size being clustered: 30,13754 INFO [2025-10-08 16:17:46] Number of cells in group(34) is 20 INFO [2025-10-08 16:17:46] group size being clustered: 20,13754 INFO [2025-10-08 16:17:46] Number of cells in group(35) is 1 INFO [2025-10-08 16:17:46] Skipping group: 35, since less than 2 entries INFO [2025-10-08 16:17:46] Number of cells in group(36) is 61 INFO [2025-10-08 16:17:46] group size being clustered: 61,13754 INFO [2025-10-08 16:17:46] Number of cells in group(37) is 31 INFO [2025-10-08 16:17:46] group size being clustered: 31,13754 INFO [2025-10-08 16:17:46] Number of cells in group(38) is 22 INFO [2025-10-08 16:17:46] group size being clustered: 22,13754 INFO [2025-10-08 16:17:46] Number of cells in group(39) is 19 INFO [2025-10-08 16:17:46] group size being clustered: 19,13754 INFO [2025-10-08 16:17:46] Number of cells in group(40) is 17 INFO [2025-10-08 16:17:46] group size being clustered: 17,13754 INFO [2025-10-08 16:17:46] Number of cells in group(41) is 11 INFO [2025-10-08 16:17:46] group size being clustered: 11,13754 INFO [2025-10-08 16:17:46] Number of cells in group(42) is 46 INFO [2025-10-08 16:17:46] group size being clustered: 46,13754 INFO [2025-10-08 16:17:46] Number of cells in group(43) is 24 INFO [2025-10-08 16:17:46] group size being clustered: 24,13754 INFO [2025-10-08 16:17:46] Number of cells in group(44) is 15 INFO [2025-10-08 16:17:46] group size being clustered: 15,13754 INFO [2025-10-08 16:17:46] Number of cells in group(45) is 11 INFO [2025-10-08 16:17:46] group size being clustered: 11,13754 INFO [2025-10-08 16:17:46] Number of cells in group(46) is 10 INFO [2025-10-08 16:17:46] group size being clustered: 10,13754 INFO [2025-10-08 16:17:46] Number of cells in group(47) is 62 INFO [2025-10-08 16:17:46] group size being clustered: 62,13754 INFO [2025-10-08 16:17:46] Number of cells in group(48) is 60 INFO [2025-10-08 16:17:46] group size being clustered: 60,13754 INFO [2025-10-08 16:17:46] Number of cells in group(49) is 39 INFO [2025-10-08 16:17:47] group size being clustered: 39,13754 INFO [2025-10-08 16:17:47] Number of cells in group(50) is 11 INFO [2025-10-08 16:17:47] group size being clustered: 11,13754 INFO [2025-10-08 16:17:47] Number of cells in group(51) is 1 INFO [2025-10-08 16:17:47] Skipping group: 51, since less than 2 entries INFO [2025-10-08 16:17:47] Number of cells in group(52) is 1 INFO [2025-10-08 16:17:47] Skipping group: 52, since less than 2 entries INFO [2025-10-08 16:17:47] Number of cells in group(53) is 90 INFO [2025-10-08 16:17:47] group size being clustered: 90,13754 INFO [2025-10-08 16:17:47] Number of cells in group(54) is 74 INFO [2025-10-08 16:17:47] group size being clustered: 74,13754 INFO [2025-10-08 16:17:47] Number of cells in group(55) is 71 INFO [2025-10-08 16:17:47] group size being clustered: 71,13754 INFO [2025-10-08 16:17:47] Number of cells in group(56) is 70 INFO [2025-10-08 16:17:47] group size being clustered: 70,13754 INFO [2025-10-08 16:17:47] Number of cells in group(57) is 67 INFO [2025-10-08 16:17:47] group size being clustered: 67,13754 INFO [2025-10-08 16:17:47] Number of cells in group(58) is 53 INFO [2025-10-08 16:17:47] group size being clustered: 53,13754 INFO [2025-10-08 16:17:47] Number of cells in group(59) is 49 INFO [2025-10-08 16:17:47] group size being clustered: 49,13754 INFO [2025-10-08 16:17:47] Number of cells in group(60) is 47 INFO [2025-10-08 16:17:47] group size being clustered: 47,13754 INFO [2025-10-08 16:17:47] Number of cells in group(61) is 42 INFO [2025-10-08 16:17:47] group size being clustered: 42,13754 INFO [2025-10-08 16:17:47] Number of cells in group(62) is 39 INFO [2025-10-08 16:17:47] group size being clustered: 39,13754 INFO [2025-10-08 16:17:47] Number of cells in group(63) is 37 INFO [2025-10-08 16:17:47] group size being clustered: 37,13754 INFO [2025-10-08 16:17:47] Number of cells in group(64) is 25 INFO [2025-10-08 16:17:47] group size being clustered: 25,13754 INFO [2025-10-08 16:17:47] Number of cells in group(65) is 21 INFO [2025-10-08 16:17:47] group size being clustered: 21,13754 INFO [2025-10-08 16:17:47] Number of cells in group(66) is 6 INFO [2025-10-08 16:17:47] group size being clustered: 6,13754 INFO [2025-10-08 16:17:47] Number of cells in group(67) is 2 INFO [2025-10-08 16:17:47] group size being clustered: 2,13754 INFO [2025-10-08 16:17:47] Number of cells in group(68) is 35 INFO [2025-10-08 16:17:47] group size being clustered: 35,13754 INFO [2025-10-08 16:17:47] Number of cells in group(69) is 17 INFO [2025-10-08 16:17:47] group size being clustered: 17,13754 INFO [2025-10-08 16:17:47] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 16:17:47] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 16:17:47] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 16:17:47] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 16:17:52] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:17:52] Quantiles of plotted data range: 0.554557588248085,0.904502433693864,1.0000284369913,1.10680317194265,1.47212953469572 INFO [2025-10-08 16:17:55] plot_cnv_references:Start INFO [2025-10-08 16:17:55] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 16:17:55] plot_cnv_references:Number reference groups= 5 INFO [2025-10-08 16:17:55] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 16:17:56] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:17:56] Quantiles of plotted data range: 0.554557588248085,0.911539841658008,1.00010701908823,1.09705757707205,1.47212953469572 INFO [2025-10-08 16:18:16] ::plot_cnv:Start INFO [2025-10-08 16:18:16] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=32934377.1150229 Min=0.202126164889467 Max=3.31282454939968. INFO [2025-10-08 16:18:17] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 16:18:17] plot_cnv(): auto thresholding at: (0.554558 , 1.472130) INFO [2025-10-08 16:18:17] plot_cnv_observation:Start INFO [2025-10-08 16:18:17] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 16:18:18] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 16:18:18] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 16:18:18] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 16:18:18] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 16:18:22] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:18:22] Quantiles of plotted data range: 0.554557588248085,0.904502433693864,1.0000284369913,1.10680317194265,1.47212953469572 INFO [2025-10-08 16:18:25] plot_cnv_references:Start INFO [2025-10-08 16:18:25] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 16:18:25] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 16:18:25] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 16:18:26] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:18:26] Quantiles of plotted data range: 0.554557588248085,0.911539841658008,1.00010701908823,1.09705757707205,1.47212953469572 INFO [2025-10-08 16:18:26] STEP 17: HMM-based CNV prediction INFO [2025-10-08 16:18:26] predict_CNV_via_HMM_on_tumor_subclusters INFO [2025-10-08 16:18:48] -done predicting CNV based on initial tumor subclusters INFO [2025-10-08 16:18:48] get_predicted_CNV_regions(subcluster) INFO [2025-10-08 16:18:48] -processing cell_group_name: 8 cell.8 cell_s2, size: 41 INFO [2025-10-08 16:18:53] -processing cell_group_name: 8 cell.8 cell_s1, size: 23 INFO [2025-10-08 16:18:57] -processing cell_group_name: AdvMes.AdvMes_s2, size: 43 INFO [2025-10-08 16:19:02] -processing cell_group_name: AdvMes.AdvMes_s3, size: 31 INFO [2025-10-08 16:19:06] -processing cell_group_name: AdvMes.AdvMes_s1, size: 28 INFO [2025-10-08 16:19:11] -processing cell_group_name: AdvMes.AdvMes_s5, size: 3 INFO [2025-10-08 16:19:15] -processing cell_group_name: AdvMes.AdvMes_s4, size: 1 INFO [2025-10-08 16:19:20] -processing cell_group_name: Amnion.Ecto.Amnion.Ecto, size: 20 INFO [2025-10-08 16:19:24] -processing cell_group_name: Axial Mes.Axial Mes_s1, size: 22 INFO [2025-10-08 16:19:29] -processing cell_group_name: DE.DE_s1, size: 52 INFO [2025-10-08 16:19:34] -processing cell_group_name: DE.DE_s2, size: 1 INFO [2025-10-08 16:19:39] -processing cell_group_name: EPI.PrE.INT.EPI.PrE.INT, size: 16 INFO [2025-10-08 16:19:43] -processing cell_group_name: Erythroblasts.Erythroblasts_s1, size: 32 INFO [2025-10-08 16:19:48] -processing cell_group_name: ExE_Mes.ExE_Mes_s1, size: 57 INFO [2025-10-08 16:19:53] -processing cell_group_name: ExE_Mes.ExE_Mes_s2, size: 38 INFO [2025-10-08 16:19:57] -processing cell_group_name: ExE_Mes.ExE_Mes_s3, size: 28 INFO [2025-10-08 16:20:02] -processing cell_group_name: ExE_Mes.ExE_Mes_s4, size: 9 INFO [2025-10-08 16:20:06] -processing cell_group_name: ExE_Mes.ExE_Mes_s5, size: 2 INFO [2025-10-08 16:20:11] -processing cell_group_name: ExE_Mes.ExE_Mes_s6, size: 1 INFO [2025-10-08 16:20:15] -processing cell_group_name: HEP.HEP_s3, size: 35 INFO [2025-10-08 16:20:20] -processing cell_group_name: HEP.HEP_s2, size: 33 INFO [2025-10-08 16:20:24] -processing cell_group_name: HEP.HEP_s4, size: 23 INFO [2025-10-08 16:20:29] -processing cell_group_name: HEP.HEP_s1, size: 18 INFO [2025-10-08 16:20:34] -processing cell_group_name: HEP.HEP_s5, size: 1 INFO [2025-10-08 16:20:38] -processing cell_group_name: Hypoblast.Hypoblast_s1, size: 45 INFO [2025-10-08 16:20:43] -processing cell_group_name: Hypoblast.Hypoblast_s2, size: 28 INFO [2025-10-08 16:20:47] -processing cell_group_name: ICM.ICM_s1, size: 29 INFO [2025-10-08 16:20:52] -processing cell_group_name: Late_Amnion.Late_Amnion_s1, size: 28 INFO [2025-10-08 16:20:57] -processing cell_group_name: Mesoderm.Mesoderm_s2, size: 95 INFO [2025-10-08 16:21:02] -processing cell_group_name: Mesoderm.Mesoderm_s1, size: 61 INFO [2025-10-08 16:21:07] -processing cell_group_name: Mesoderm.Mesoderm_s4, size: 38 INFO [2025-10-08 16:21:12] -processing cell_group_name: Mesoderm.Mesoderm_s3, size: 33 INFO [2025-10-08 16:21:16] -processing cell_group_name: Mesoderm.Mesoderm_s5, size: 30 INFO [2025-10-08 16:21:21] -processing cell_group_name: Mesoderm.Mesoderm_s7, size: 20 INFO [2025-10-08 16:21:26] -processing cell_group_name: Mesoderm.Mesoderm_s6, size: 1 INFO [2025-10-08 16:21:30] -processing cell_group_name: Morula.Morula_s3, size: 61 INFO [2025-10-08 16:21:35] -processing cell_group_name: Morula.Morula_s2, size: 31 INFO [2025-10-08 16:21:39] -processing cell_group_name: Morula.Morula_s5, size: 22 INFO [2025-10-08 16:21:44] -processing cell_group_name: Morula.Morula_s6, size: 19 INFO [2025-10-08 16:21:49] -processing cell_group_name: Morula.Morula_s4, size: 17 INFO [2025-10-08 16:21:53] -processing cell_group_name: Morula.Morula_s1, size: 11 INFO [2025-10-08 16:21:58] -processing cell_group_name: Prelineage.Prelineage_s2, size: 46 INFO [2025-10-08 16:22:02] -processing cell_group_name: Prelineage.Prelineage_s4, size: 24 INFO [2025-10-08 16:22:07] -processing cell_group_name: Prelineage.Prelineage_s1, size: 15 INFO [2025-10-08 16:22:11] -processing cell_group_name: Prelineage.Prelineage_s3, size: 11 INFO [2025-10-08 16:22:16] -processing cell_group_name: Prelineage.Prelineage_s5, size: 10 INFO [2025-10-08 16:22:20] -processing cell_group_name: PriS.PriS_s1, size: 62 INFO [2025-10-08 16:22:25] -processing cell_group_name: PriS.PriS_s3, size: 60 INFO [2025-10-08 16:22:30] -processing cell_group_name: PriS.PriS_s2, size: 39 INFO [2025-10-08 16:22:35] -processing cell_group_name: PriS.PriS_s4, size: 11 INFO [2025-10-08 16:22:40] -processing cell_group_name: PriS.PriS_s5, size: 1 INFO [2025-10-08 16:22:44] -processing cell_group_name: PriS.PriS_s6, size: 1 INFO [2025-10-08 16:22:48] -processing cell_group_name: TE.TE_s11, size: 90 INFO [2025-10-08 16:22:53] -processing cell_group_name: TE.TE_s4, size: 74 INFO [2025-10-08 16:22:58] -processing cell_group_name: TE.TE_s3, size: 71 INFO [2025-10-08 16:23:04] -processing cell_group_name: TE.TE_s6, size: 70 INFO [2025-10-08 16:23:09] -processing cell_group_name: TE.TE_s13, size: 67 INFO [2025-10-08 16:23:14] -processing cell_group_name: TE.TE_s10, size: 53 INFO [2025-10-08 16:23:18] -processing cell_group_name: TE.TE_s1, size: 49 INFO [2025-10-08 16:23:23] -processing cell_group_name: TE.TE_s2, size: 47 INFO [2025-10-08 16:23:28] -processing cell_group_name: TE.TE_s15, size: 42 INFO [2025-10-08 16:23:32] -processing cell_group_name: TE.TE_s14, size: 39 INFO [2025-10-08 16:23:37] -processing cell_group_name: TE.TE_s12, size: 37 INFO [2025-10-08 16:23:42] -processing cell_group_name: TE.TE_s7, size: 25 INFO [2025-10-08 16:23:47] -processing cell_group_name: TE.TE_s8, size: 21 INFO [2025-10-08 16:23:51] -processing cell_group_name: TE.TE_s5, size: 6 INFO [2025-10-08 16:23:57] -processing cell_group_name: TE.TE_s9, size: 2 INFO [2025-10-08 16:24:03] -processing cell_group_name: YSE.YSE_s1, size: 35 INFO [2025-10-08 16:24:09] -processing cell_group_name: YSE.YSE_s2, size: 17 INFO [2025-10-08 16:24:14] -processing cell_group_name: Epiblast.Epiblast_s4, size: 74 INFO [2025-10-08 16:24:22] -processing cell_group_name: Epiblast.Epiblast_s1, size: 71 INFO [2025-10-08 16:24:27] -processing cell_group_name: Epiblast.Epiblast_s2, size: 58 INFO [2025-10-08 16:24:33] -processing cell_group_name: Epiblast.Epiblast_s3, size: 7 INFO [2025-10-08 16:24:37] -processing cell_group_name: Epiblast.Epiblast_s5, size: 1 INFO [2025-10-08 16:24:42] -writing cell clusters file: output_dir_integrated/17_HMM_predHMMi6.leiden.hmm_mode-subclusters.cell_groupings INFO [2025-10-08 16:24:42] -writing cnv regions file: output_dir_integrated/17_HMM_predHMMi6.leiden.hmm_mode-subclusters.pred_cnv_regions.dat INFO [2025-10-08 16:24:43] -writing per-gene cnv report: output_dir_integrated/17_HMM_predHMMi6.leiden.hmm_mode-subclusters.pred_cnv_genes.dat INFO [2025-10-08 16:24:44] -writing gene ordering info: output_dir_integrated/17_HMM_predHMMi6.leiden.hmm_mode-subclusters.genes_used.dat INFO [2025-10-08 16:24:56] ::plot_cnv:Start INFO [2025-10-08 16:24:56] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=98919851 Min=2 Max=6. INFO [2025-10-08 16:24:56] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 16:24:57] plot_cnv_observation:Start INFO [2025-10-08 16:24:57] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 16:24:57] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 16:24:57] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 16:24:57] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 16:24:57] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 16:25:01] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:25:01] Quantiles of plotted data range: 2,3,3,3,6 INFO [2025-10-08 16:25:03] plot_cnv_references:Start INFO [2025-10-08 16:25:03] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 16:25:04] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 16:25:04] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 16:25:04] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:25:04] Quantiles of plotted data range: 2,3,3,3,6 INFO [2025-10-08 16:25:05] STEP 18: Run Bayesian Network Model on HMM predicted CNVs INFO [2025-10-08 16:25:05] Creating the following Directory: output_dir_integrated/BayesNetOutput.HMMi6.leiden.hmm_mode-subclusters INFO [2025-10-08 16:25:05] Initializing new MCM InferCNV Object. INFO [2025-10-08 16:25:05] validating infercnv_obj INFO [2025-10-08 16:25:06] Total CNV's: 1970 INFO [2025-10-08 16:25:06] Loading BUGS Model. INFO [2025-10-08 16:25:07] Running Sampling Using Parallel with 4 Cores INFO [2025-10-08 16:52:09] Obtaining probabilities post-sampling INFO [2025-10-08 16:53:50] Gibbs sampling time: 28.7153206825256 Minutes INFO [2025-10-08 16:56:52] ::plot_cnv:Start INFO [2025-10-08 16:56:52] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=2423104.67580038 Min=0 Max=0.98513614063991. INFO [2025-10-08 16:56:53] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 16:56:54] plot_cnv_observation:Start INFO [2025-10-08 16:56:54] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 16:56:54] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 16:56:54] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 16:56:54] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 16:56:54] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 16:56:58] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:56:58] Quantiles of plotted data range: 0,0,0,0,0.98513614063991 INFO [2025-10-08 16:57:00] plot_cnv_references:Start INFO [2025-10-08 16:57:00] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 16:57:00] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 16:57:00] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 16:57:01] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 16:57:01] Quantiles of plotted data range: 0,0,0,0,0.924577526141671 INFO [2025-10-08 16:57:42] STEP 19: Filter HMM predicted CNVs based on the Bayesian Network Model results and BayesMaxPNormal INFO [2025-10-08 16:57:43] Attempting to removing CNV(s) with a probability of being normal above 0.5 INFO [2025-10-08 16:57:43] Removing 13 CNV(s) identified by the HMM. INFO [2025-10-08 16:57:43] Total CNV's after removing: 1957 INFO [2025-10-08 16:57:43] Reassigning CNVs based on state probabilities. INFO [2025-10-08 16:57:43] Changing the following CNV's states assigned by the HMM to the following based on the CNV's state probabilities. chr3-region_119 : 4 (P= 0.405983230281579 ) -> 3 (P= 0.42121403711082 ) chr3-region_124 : 5 (P= 0.283935789976073 ) -> 4 (P= 0.295544990319134 ) chr6-region_137 : 2 (P= 0.379528294059109 ) -> 3 (P= 0.481661644994921 ) chr9-region_157 : 4 (P= 0.369931945292196 ) -> 3 (P= 0.424836685215562 ) chr11-region_169 : 2 (P= 0.419406917772341 ) -> 3 (P= 0.443199452965677 ) chr12-region_180 : 5 (P= 0.276399181672451 ) -> 4 (P= 0.378758505125002 ) chr19-region_205 : 4 (P= 0.406106160305132 ) -> 3 (P= 0.449925530366546 ) chr19-region_207 : 4 (P= 0.242153484347523 ) -> 5 (P= 0.343821157537603 ) chr19-region_210 : 5 (P= 0.275678135147369 ) -> 4 (P= 0.276167504097574 ) chr19-region_211 : 4 (P= 0.252005298913088 ) -> 3 (P= 0.317173008991551 ) chr22-region_218 : 4 (P= 0.343504175108546 ) -> 3 (P= 0.448340193611606 ) chr6-region_230 : 4 (P= 0.351277129123129 ) -> 3 (P= 0.440018195789041 ) chr17-region_306 : 4 (P= 0.387268534302063 ) -> 3 (P= 0.478069281981049 ) chr3-region_326 : 4 (P= 0.413932664934364 ) -> 3 (P= 0.439601663314522 ) chr1-region_368 : 4 (P= 0.224223823387291 ) -> 3 (P= 0.338620855406641 ) chr3-region_375 : 4 (P= 0.277429685770763 ) -> 3 (P= 0.278709598875707 ) chr5-region_388 : 4 (P= 0.222472727906337 ) -> 3 (P= 0.331973157085 ) chr6-region_391 : 4 (P= 0.111074188636388 ) -> 3 (P= 0.333447313790936 ) chr7-region_398 : 4 (P= 0.110843549508546 ) -> 3 (P= 0.332679540457241 ) chr8-region_576 : 2 (P= 0.421086864596863 ) -> 3 (P= 0.424400795667482 ) chr9-region_578 : 2 (P= 0.416667734895305 ) -> 3 (P= 0.428803806317556 ) chr8-region_809 : 4 (P= 0.276314031937032 ) -> 3 (P= 0.363258251706968 ) chr15-region_865 : 4 (P= 0.423610543337225 ) -> 3 (P= 0.456322693934349 ) chr11-region_968 : 4 (P= 0.35361003417681 ) -> 3 (P= 0.432153957469345 ) chr18-region_981 : 4 (P= 0.372226939692477 ) -> 3 (P= 0.434921903878718 ) chr2-region_1046 : 4 (P= 0.274171708063671 ) -> 3 (P= 0.390686532909412 ) chr9-region_1081 : 4 (P= 0.328354445816868 ) -> 3 (P= 0.333676323763557 ) chr13-region_1093 : 4 (P= 0.332328179875219 ) -> 3 (P= 0.399724030008997 ) chr1-region_1125 : 5 (P= 0.24940539810775 ) -> 4 (P= 0.249496776373159 ) chr2-region_1132 : 5 (P= 0.239607763703251 ) -> 4 (P= 0.249324326736286 ) chr4-region_1146 : 4 (P= 0.250270238933685 ) -> 3 (P= 0.251700596709065 ) chr7-region_1164 : 4 (P= 0.2025035448253 ) -> 3 (P= 0.250834326898898 ) chr7-region_1165 : 5 (P= 0.249540216093397 ) -> 4 (P= 0.249759801782465 ) chr12-region_1187 : 4 (P= 0.127371674427561 ) -> 5 (P= 0.251262172932175 ) chr15-region_1194 : 4 (P= 0.249432694249778 ) -> 3 (P= 0.252498431405592 ) chr15-region_1196 : 4 (P= 0.248571271884588 ) -> 3 (P= 0.252238846732552 ) chr15-region_1198 : 4 (P= 0.243420412387099 ) -> 3 (P= 0.251848967904412 ) chr17-region_1208 : 4 (P= 0.247888717372483 ) -> 3 (P= 0.25420922170597 ) chr18-region_1212 : 5 (P= 0.125962796488537 ) -> 6 (P= 0.250948891243788 ) chr19-region_1216 : 2 (P= 0.247975228200793 ) -> 3 (P= 0.247994048233104 ) chr19-region_1220 : 2 (P= 0.247682021332085 ) -> 3 (P= 0.255913413320256 ) chr3-region_1245 : 2 (P= 0.147275623479186 ) -> 3 (P= 0.285495730095783 ) chr6-region_1452 : 4 (P= 0.344446706098626 ) -> 3 (P= 0.379109091877934 ) chr16-region_1473 : 4 (P= 0.413614424073976 ) -> 3 (P= 0.413849320671028 ) chr22-region_1702 : 4 (P= 0.444384538981841 ) -> 3 (P= 0.477511052213738 ) chr11-region_1726 : 4 (P= 0.440579231742526 ) -> 3 (P= 0.44182930919806 ) chr12-region_1729 : 4 (P= 0.395569632654088 ) -> 3 (P= 0.469767295036065 ) chr15-region_1777 : 2 (P= 0.400225289142585 ) -> 3 (P= 0.486251774353427 ) chr15-region_1779 : 2 (P= 0.420412401132769 ) -> 3 (P= 0.46587110090427 ) chr22-region_1803 : 4 (P= 0.400707911721438 ) -> 3 (P= 0.485746431078324 ) chr6-region_1856 : 4 (P= 0.365590485568002 ) -> 3 (P= 0.434168748129988 ) chr15-region_1928 : 4 (P= 0.429315480088322 ) -> 3 (P= 0.447228832093577 ) chr6-region_2052 : 4 (P= 0.307710168865647 ) -> 3 (P= 0.333554227854005 ) chr9-region_2108 : 2 (P= 0.346703286697403 ) -> 3 (P= 0.499045432973857 ) chr5-region_2271 : 2 (P= 0.446096635536903 ) -> 3 (P= 0.494316456670122 ) chr17-region_2311 : 5 (P= 0.353560326843123 ) -> 4 (P= 0.450810609658249 ) chr8-region_2442 : 4 (P= 0.322203153435541 ) -> 3 (P= 0.356794644860704 ) chr1-region_2577 : 2 (P= 0.391966121037366 ) -> 3 (P= 0.433625630811318 ) chr6-region_2592 : 2 (P= 0.390218033481026 ) -> 3 (P= 0.436245948275901 ) chr6-region_2596 : 2 (P= 0.389876275410445 ) -> 3 (P= 0.434527010714195 ) chr14-region_2620 : 4 (P= 0.304857035928724 ) -> 3 (P= 0.392620208854509 ) chr17-region_2633 : 4 (P= 0.348825307751822 ) -> 3 (P= 0.39068449126752 ) chr21-region_2647 : 4 (P= 0.351476189948465 ) -> 3 (P= 0.382573894188479 ) chr1-region_2653 : 5 (P= 0.232820586530421 ) -> 4 (P= 0.356116249241823 ) chr2-region_2658 : 2 (P= 0.178158856972292 ) -> 3 (P= 0.466600015776938 ) chr4-region_2665 : 4 (P= 0.292934721155979 ) -> 5 (P= 0.352484642956504 ) chr5-region_2668 : 2 (P= 0.35311736537782 ) -> 3 (P= 0.410398240882965 ) chr6-region_2672 : 4 (P= 0.29163176389675 ) -> 3 (P= 0.356466029645783 ) chr7-region_2675 : 4 (P= 0.233823672310192 ) -> 3 (P= 0.41269987834234 ) chr8-region_2678 : 4 (P= 0.294410547989494 ) -> 3 (P= 0.352666363895105 ) chr9-region_2682 : 4 (P= 0.2934583982015 ) -> 3 (P= 0.413701911328675 ) chr10-region_2685 : 4 (P= 0.188338486535807 ) -> 3 (P= 0.411602446400542 ) chr22-region_2731 : 2 (P= 0.294634384555265 ) -> 3 (P= 0.471280020694834 ) chr4-region_2744 : 5 (P= 0.335636794378098 ) -> 4 (P= 0.4892367455599 ) chr17-region_2863 : 5 (P= 0.334197416693962 ) -> 4 (P= 0.507073593138038 ) chr4-region_3016 : 2 (P= 0.374776618039341 ) -> 3 (P= 0.37498282733221 ) chr2-region_3093 : 4 (P= 0.426925967816796 ) -> 3 (P= 0.441288598087111 ) chr15-region_3138 : 4 (P= 0.441872037774394 ) -> 3 (P= 0.455724121085901 ) chr19-region_3147 : 2 (P= 0.450124896752441 ) -> 3 (P= 0.491504602634677 ) chr21-region_3155 : 4 (P= 0.425434256658998 ) -> 3 (P= 0.455582712404893 ) chr2-region_3189 : 4 (P= 0.406301636515673 ) -> 3 (P= 0.460086919245965 ) chr5-region_3200 : 4 (P= 0.419906564216863 ) -> 3 (P= 0.445966238590655 ) chr6-region_3206 : 5 (P= 0.177967386753564 ) -> 4 (P= 0.410756179784966 ) chr20-region_3235 : 2 (P= 0.421702503362424 ) -> 3 (P= 0.489959346424674 ) chr6-region_3255 : 5 (P= 0.293172778413102 ) -> 4 (P= 0.293781746807126 ) chr19-region_3399 : 2 (P= 0.146620252241022 ) -> 3 (P= 0.280165620376585 ) chr19-region_3666 : 2 (P= 0.453392828046475 ) -> 3 (P= 0.494330050845097 ) chr7-region_3824 : 2 (P= 0.433438196188996 ) -> 3 (P= 0.499011256793833 ) chr1-region_3919 : 5 (P= 0.306325511814273 ) -> 4 (P= 0.461718050549069 ) chr1-region_3922 : 4 (P= 0.395439472930336 ) -> 3 (P= 0.481017054204587 ) chr11-region_3975 : 5 (P= 0.359987228457572 ) -> 4 (P= 0.382803011826221 ) chr12-region_3979 : 2 (P= 0.455424781096251 ) -> 3 (P= 0.46869079275771 ) chr2-region_4113 : 4 (P= 0.40191261649827 ) -> 3 (P= 0.404565641338224 ) chr11-region_4294 : 2 (P= 0.370046421085695 ) -> 3 (P= 0.481497647165132 ) chr12-region_4299 : 4 (P= 0.373124908150481 ) -> 3 (P= 0.442375965034522 ) chr17-region_4308 : 4 (P= 0.333729124107909 ) -> 3 (P= 0.443888215439595 ) chr1-region_4329 : 4 (P= 0.251893205832908 ) -> 3 (P= 0.332833242411248 ) chr6-region_4341 : 4 (P= 0.254012103495276 ) -> 3 (P= 0.332562310875634 ) chr11-region_4354 : 4 (P= 0.251279012098997 ) -> 3 (P= 0.333486637881748 ) chr16-region_4369 : 4 (P= 0.332834245755072 ) -> 3 (P= 0.336156107990993 ) chr17-region_4376 : 4 (P= 0.0855598390483393 ) -> 3 (P= 0.499288246636372 ) chr12-region_4422 : 2 (P= 0.250679809506509 ) -> 3 (P= 0.252034265793059 ) chr12-region_4424 : 4 (P= 0.249852285420866 ) -> 3 (P= 0.249946742002308 ) chr21-region_4450 : 4 (P= 0.248308698260772 ) -> 3 (P= 0.253537263610524 ) chr6-region_4559 : 4 (P= 0.320469286885272 ) -> 3 (P= 0.412950398430061 ) chr2-region_4579 : 2 (P= 0.467516444291257 ) -> 3 (P= 0.480577044422975 ) chr1-region_4650 : 5 (P= 0.155501110233243 ) -> 6 (P= 0.304532414739023 ) chr1-region_4653 : 4 (P= 0.229566101255422 ) -> 3 (P= 0.460907610578825 ) chr1-region_4663 : 2 (P= 0.229938742886836 ) -> 3 (P= 0.460979894441005 ) chr2-region_4665 : 2 (P= 0.30698493367828 ) -> 3 (P= 0.38241929200582 ) chr4-region_4678 : 2 (P= 0.30571078612881 ) -> 3 (P= 0.385698317092313 ) chr8-region_4708 : 5 (P= 0.231602767793764 ) -> 6 (P= 0.306938703874863 ) chr9-region_4712 : 5 (P= 0.229262838971002 ) -> 4 (P= 0.387833854031756 ) chr16-region_4742 : 4 (P= 0.306770320841458 ) -> 5 (P= 0.307558375102484 ) chr16-region_4752 : 4 (P= 0.23426953827699 ) -> 5 (P= 0.306133744120125 ) chr17-region_4758 : 5 (P= 0.321238195979 ) -> 4 (P= 0.370316511023943 ) chr17-region_4759 : 4 (P= 0.320127532141369 ) -> 5 (P= 0.37089216983982 ) chr18-region_4761 : 2 (P= 0.308340788486555 ) -> 3 (P= 0.385002439964717 ) chr19-region_4764 : 4 (P= 0.152229888392134 ) -> 5 (P= 0.463040905048986 ) chr21-region_4777 : 4 (P= 0.232451461281039 ) -> 5 (P= 0.308402958076632 ) INFO [2025-10-08 16:57:43] Creating Plots for CNV and cell Probabilities. INFO [2025-10-08 17:09:13] ::plot_cnv:Start INFO [2025-10-08 17:09:13] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=2405115.59252957 Min=0 Max=0.98513614063991. INFO [2025-10-08 17:09:13] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 17:09:13] plot_cnv_observation:Start INFO [2025-10-08 17:09:13] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 17:09:14] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 17:09:14] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 17:09:14] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 17:09:14] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 17:09:18] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:09:18] Quantiles of plotted data range: 0,0,0,0,0.98513614063991 INFO [2025-10-08 17:09:20] plot_cnv_references:Start INFO [2025-10-08 17:09:20] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 17:09:20] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 17:09:20] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 17:09:20] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:09:20] Quantiles of plotted data range: 0,0,0,0,0.924577526141671 INFO [2025-10-08 17:09:33] ::plot_cnv:Start INFO [2025-10-08 17:09:33] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=98928502 Min=2 Max=6. INFO [2025-10-08 17:09:33] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 17:09:33] plot_cnv_observation:Start INFO [2025-10-08 17:09:33] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 17:09:33] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 17:09:33] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 17:09:34] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 17:09:34] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 17:09:38] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:09:38] Quantiles of plotted data range: 2,3,3,3,6 INFO [2025-10-08 17:09:40] plot_cnv_references:Start INFO [2025-10-08 17:09:40] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 17:09:40] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 17:09:40] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 17:09:41] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:09:41] Quantiles of plotted data range: 2,3,3,3,6 INFO [2025-10-08 17:09:47] STEP 20: Converting HMM-based CNV states to repr expr vals INFO [2025-10-08 17:10:00] ::plot_cnv:Start INFO [2025-10-08 17:10:00] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=33218051 Min=0.5 Max=3. INFO [2025-10-08 17:10:00] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 17:10:00] plot_cnv_observation:Start INFO [2025-10-08 17:10:00] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 17:10:01] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 17:10:01] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 17:10:01] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 17:10:01] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 17:10:05] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:10:05] Quantiles of plotted data range: 0.5,1,1,1,3 INFO [2025-10-08 17:10:07] plot_cnv_references:Start INFO [2025-10-08 17:10:07] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 17:10:08] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 17:10:08] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 17:10:09] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:10:09] Quantiles of plotted data range: 0.5,1,1,1,3 INFO [2025-10-08 17:10:09] STEP 22: Denoising INFO [2025-10-08 17:10:09] ::process_data:Remove noise, noise threshold defined via ref mean sd_amplifier: 1.5 INFO [2025-10-08 17:10:09] denoising using mean(normal) +- sd_amplifier * sd(normal) per gene per cell across all data INFO [2025-10-08 17:10:09] :: **** clear_noise_via_ref_quantiles **** : removing noise between bounds: 0.791835523440265 - 1.22959436441825 INFO [2025-10-08 17:10:33] ## Making the final infercnv heatmap ## INFO [2025-10-08 17:10:34] ::plot_cnv:Start INFO [2025-10-08 17:10:34] ::plot_cnv:Current data dimensions (r,c)=13754,2363 Total=33244923.5683827 Min=0.202126164889467 Max=3.31282454939968. INFO [2025-10-08 17:10:34] ::plot_cnv:Depending on the size of the matrix this may take a moment. INFO [2025-10-08 17:10:35] plot_cnv(): auto thresholding at: (0.527870 , 1.472130) INFO [2025-10-08 17:10:36] plot_cnv_observation:Start INFO [2025-10-08 17:10:36] Observation data size: Cells= 2152 Genes= 13754 INFO [2025-10-08 17:10:36] plot_cnv_observation:Writing observation groupings/color. INFO [2025-10-08 17:10:36] plot_cnv_observation:Done writing observation groupings/color. INFO [2025-10-08 17:10:36] plot_cnv_observation:Writing observation heatmap thresholds. INFO [2025-10-08 17:10:36] plot_cnv_observation:Done writing observation heatmap thresholds. INFO [2025-10-08 17:10:41] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:10:41] Quantiles of plotted data range: 0.527870465304278,1.01071494392926,1.01071494392926,1.01071494392926,1.47212953469572 INFO [2025-10-08 17:10:43] plot_cnv_references:Start INFO [2025-10-08 17:10:43] Reference data size: Cells= 211 Genes= 13754 INFO [2025-10-08 17:10:44] plot_cnv_references:Number reference groups= 1 INFO [2025-10-08 17:10:44] plot_cnv_references:Plotting heatmap. INFO [2025-10-08 17:10:44] Colors for breaks: #00008B,#24249B,#4848AB,#6D6DBC,#9191CC,#B6B6DD,#DADAEE,#FFFFFF,#EEDADA,#DDB6B6,#CC9191,#BC6D6D,#AB4848,#9B2424,#8B0000 INFO [2025-10-08 17:10:44] Quantiles of plotted data range: 0.527870465304278,1.01071494392926,1.01071494392926,1.01071494392926,1.47212953469572
Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: CAPN10, RNPEPL1, KIF1A, DUSP28, ANKMY1, PASK, GPC1, NDUFA10, PPP1R7, HDAC4 ANO7, ASB1, TRAF3IP1, KIAA0930, FAM118A, SMC1B, NUP50, RIBC2, PER2, FBLN1 HDLBP, ATXN10, PHF21B, WNT7B, PPARA, ARHGAP8, CDPF1, TTC38, HES6, GTSE1 Negative: ZNF610, ZNF480, ZNF766, PPP2R1A, ZNF836, GABBR1, ZNF616, MOG, ZNF841, ZNF432 ZFP57, ZNF614, HLA-F, ZNF615, ZSCAN4, ZNF551, ZNF154, ZNF211, HLA-G, ZNF134 ZNF350, ZNF256, ZNF613, HLA-A, C19orf18, ZNF44, ZNF563, ZNF606, ZNF136, ZNF442 PC_ 2 Positive: DTL, INTS7, PPP2R5A, TMEM206, LPGAT1, NENF, NEK2, SLC30A1, ATF3, LINC00467 BATF3, TRAF5, RCOR3, KCNH1, NSL1, HHAT, SERTAD4, TATDN3, SYT14, FLVCR1 IRF6, C1orf74, LAMB3, TRAF3IP3, G0S2, PLXNA2, CD34, VASH2, CD46, CR1L Negative: KLK7, KLK6, KLK8, KLK2, C19orf48, KLK10, GPR32, CLEC11A, KLK11, C19orf81 JOSD2, KLK13, EMC10, CTU1, FAM71E1, VSIG10L, SPIB, ETFB, LIM2, POLD1 SIGLEC10, NAPSA, ZNF175, NR1H2, ZNF577, KCNC3, ZNF649, ZNF613, ZNF473, ZNF350 PC_ 3 Positive: EGFLAM, LIFR, LIFR-AS1, OSMR, RICTOR, DAB2, PTGER4, TTC33, PRKAA1, RPL37 OXCT1, C5orf51, FBXO4, GHR, CCDC152, ANXA2R, ZNF131, HMGCS1, CCL28, C5orf34 PAIP1, NNT-AS1, NNT, MRPS30, HCN1, EMB, TIGIT, ZDHHC23, GRAMD1C, ATP6V1A Negative: EML2, GPR4, GIPR, OPA3, SNRPD2, VASP, PPM1N, QPCTL, FOSB, RTN2 FBXO46, SIX5, DMPK, ERCC1, DMWD, SYMPK, FOXA3, CD3EAP, IRF2BP1, MYPOP GDF15, PGPEP1, SSBP4, PPP1R13L, LSM4, CCDC61, ISYNA1, PPP5C, CCDC8, HIF3A PC_ 4 Positive: IFI6, FAM76A, STX12, PPP1R8, THEMIS2, RPA2, SMPDL3B, SFPQ, ZMYM4, KIAA0319L NCDN, ZMYM1, XKR8, PSMB2, C1orf216, ZMYM6, CLSPN, DLGAP3, AGO4, AGO1 GJA4, EYA3, SH3D21, AGO3, EVA1B, GJB3, THRAP3, TEKT2, STK40, ADPRHL2 Negative: BMPR2, FAM117B, ICA1L, NOP58, SUMO1, AC079354.3, WDR12, FZD7, CARF, ALS2 NBEAL1, TMEM237, STRADB, CYP20A1, TRAK2, CASP8, ABI2, CASP10, RAPH1, PARD3B CFLAR-AS1, NRP2, CFLAR, INO80D, NDUFB3, AC007383.3, FAM126B, NDUFS1, ORC2, NIF3L1 PC_ 5 Positive: HIGD2A, CLTB, FAF2, RNF44, GPRIN1, SNCB, EIF4E1B, TSPAN17, UIMC1, ZNF346 FGFR4, NSD1, RAB24, MXD3, PRELID1, LMAN2, RGS14, F12, GRK6, PCDHGA12 PRR7, PCDHGA10, DBN1, PCDHGA2, PDLIM7, TAF7, DOK3, PCDHB5, PCDHB3, DDX41 Negative: LRFN5, C14orf28, FBXO33, MIA2, KLHL28, PNN, PRPF39, TRAPPC6B, FKBP3, FANCM MIS18BP1, RPL10L, GEMIN2, MDGA2, KIAA0319L, ZMYM4, LINC00648, RPS29, CLSPN, NCDN AGO4, C1orf216, AL139099.1, PSMB2, SEC23A, SFPQ, AGO1, ZMYM1, MIPOL1, ZMYM6 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: IL11RA, DNAJB5, GALT, VCP, SIGMAR1, FANCG, DCTN3, PIGO, RPP25L, CNTFR STOML2, ENHO, FAM214B, DNAI1, UNC13B, FAM219A, RUSC2, C9orf24, NUDT2, TESK1 KIF24, CD72, UBAP1, RMRP, DCAF12, UBAP2, CCDC107, UBE2R2, ARHGEF39, DLGAP4 Negative: ENY2, NUDCD1, PKHD1L1, TMEM74, EBAG9, SYBU, EMC2, KCNV1, TRPS1, EIF3E EIF3H, UTP23, RSPO2, RAD21, SLC30A8, ANGPT1, MED30, EXT1, SAMD12, OXR1 TNFRSF11B, ZFPM2, COLEC10, MAL2, LRP12, ENPP2, DPYS, TAF2, RIMS2, DSCC1 PC_ 2 Positive: CTU1, VSIG10L, ETFB, LIM2, SIGLEC10, ZNF175, ZNF577, ZNF649, ZNF613, ZNF350 ZNF615, ZNF614, ZNF432, ZNF841, ZNF616, ZNF836, ZNF324, ZNF446, PPP2R1A, ZNF766 ZNF480, ZNF610, SLC25A1, DGCR2, USP18, TUBA8, PEX26, MICAL3, CLCF1, RPS6KB2 Negative: ZNF644, ZNF326, CDC7, TGFBR3, LRRC8D, BRDT, LRRC8C, EPHX4, LRRC8B, BTBD8 C1orf146, GBP4, GLMN, GBP2, RPAP2, GBP1, GFI1, RBMXL1, EVI5, RPL5 GTF2B, FAM69A, MTF2, CCDC18, TMED5, PKN2, DR1, FNBP1L, LMO4, BCAR3 PC_ 3 Positive: AZI2, CMC1, RBMS3, EOMES, TGFBR2, SLC4A7, STT3B, LRRC3B, OSBPL10, OXSM ZNF860, NGLY1, GPD1L, TOP2B, RARB, THRB, CMTM8, NR1D2, RPL15, CMTM7 CMTM6, NKIRAS1, DYNC1LI1, UBE2E1, CNOT10, UBE2E2, TRIM71, ZNF385D-AS1, GLB1, KAT2B Negative: RNLS, PTEN, STAMBPL1, KLLN, ATAD1, PAPSS2, ACTA2, MINPP1, NUTM2A-AS1, LIPA GLUD1, IFIT3, IFIT1, SNCG, IFIT5, MMRN2, PANK1, BMPR1A, KIF20B, RPP30 CCSER2, PCGF5, ANKRD1, HECTD2, PPP1R3C, LINC00858, TNKS2, CDHR1, BTAF1, CPEB3 PC_ 4 Positive: STXBP6, NOVA1, SDR39U1, FOXG1, PRKD1, KHNYN, NYNRIN, G2E3, LTB4R, SCFD1 LTB4R2, COCH, CIDEB, NOP9, DHRS1, RABGGTA, TGM1, TINF2, GMPR2, NEDD8 NEDD8-MDP1, MDP1, CHMP4A, NUP43, LATS1, PCMT1, LRP11, GINM1, KATNA1, PPP1R14C Negative: NDUFB5, USP13, MRPL47, ACTL6A, TTC14, STRA6, ISLR, CCDC33, CCDC39, CYP11A1 PML, SEMA7A, FXR1, UBL7, STOML1, UBL7-AS1, ARID3B, DNAJC19, LOXL1, CLK3 EDC3, SOX2, LOXL1-AS1, ATP8B1, NARS, CYP1A1, NEDD4L, FECH, LINC-ROR, CSK PC_ 5 Positive: ARHGAP19-SLIT1, LCOR, PIK3AP1, TM9SF3, ARHGAP19, FRAT1, TLL2, FRAT2, ZNF518A, CCNJ RRP12, CC2D2B, PGAM1, ENTPD1-AS1, EXOSC1, ENTPD1, ZDHHC16, TCTN3, MMS19, ALDH18A1 UBTD1, SORBS1, ANKRD2, PDLIM1, PI4K2A, CYP2C18, MORN4, EFNA4, EFNA3, ADAM15 Negative: BRF2, RAB11FIP1, EIF4EBP1, ASH2L, STAR, ERLIN2, LETM2, FGFR1, DDHD2, ZNF703 LSM1, BAG4, TACC1, UNC5D, PLEKHA2, RNF122, MAK16, HTRA4, TTI2, FUT10 NRG1, WRN, TM2D2, TEX15, PPP2CB, ADAM9, UBXN8, ADAM32, GSR, ADAM18 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: VCAN, XRCC4, TMEM167A, FBXO4, C5orf51, GHR, CCL28, CCDC152, ANXA2R, ZNF131 HMGCS1, C5orf34, ATP6AP1L, OXCT1, PAIP1, RPL37, NNT-AS1, OSMR, LIFR-AS1, RPS23 LIFR, PRKAA1, RICTOR, DAB2, PTGER4, TTC33, NNT, EGFLAM, WDR70, MRPS30 Negative: HDAC5, G6PC3, C17orf53, LSM12, SSBP4, GDF15, ISYNA1, TMEM101, PGPEP1, LSM4 JUND, PDE4C, RAB3A, MPV17L2, NAGS, IFI30, PYY, PIK3R2, MAST3, PPY MPP2, ARRDC2, MPP3, KCNN1, DUSP3, CCDC124, ETV4, SLC5A5, DHX8, RPL18A PC_ 2 Positive: INPP5A, NKX6-2, UTF1, VENTX, ADAM8, PWWP2B, TUBGCP2, ZNF511, LRRC27, PRAP1 STK32C, DPYSL4, FUOM, BNIP3, ECHS1, PPP2R2D, PAOX, TCERG1L, GLRX3, MTG1 MGMT, DCP1A, CACNA1D, MKI67, ERC2, CHDH, IL17RB, CCDC66, TKT, PTPRE Negative: HMGA1, NUDT3, GRM4, RPS10-NUDT3, LEMD2, RPS10, PACSIN1, UQCC2, C6orf106, SNRPC ITPR3, UHRF1BP1, BAK1, TAF11, ANKS1A, TCP11, ZBTB9, SCUBE3, CUTA, PHF1 KIFC1, DAXX, ZBTB22, TAPBP, RGL2, PFDN6, WDR46, MICAL3, PEX26, TUBA8 PC_ 3 Positive: ACOX2, KCTD6, PDHB, C3orf67, FHIT, PXK, PTPRG, RPP14, ABHD6, DNASE1L3 FLNB-AS1, FLNB, SLMAP, DENND6A, AC022400.2, ARF4, AC022400.1, SEC24C, USP54, MYOZ1 PPP3CB, MSS51, ANXA7, PDE12, DNAJC9-AS1, NEU1, MRPS16, SLC44A4, C6orf48, EHMT2 Negative: ENTPD1-AS1, CC2D2B, ALDH18A1, CCNJ, SORBS1, ZNF518A, PDLIM1, CYP2C18, HELLS, TLL2 TBC1D12, NOC3L, TM9SF3, PLCE1, SLC35G1, PIK3AP1, LGI1, FRA10AC1, LCOR, PDE6C RBP4, CEP55, MYOF, ARHGAP19-SLIT1, CYP26A1, EXOC6, ARHGAP19, HHEX, FRAT1, KIF11 PC_ 4 Positive: AQR, ACTC1, GOLGA8B, ZNF770, GOLGA8A, NANOGP8, DPH6, C15orf41, LPCAT4, MEIS2 SPRED1, NUTM1, FAM98B, NOP10, RASGRP1, SLC12A6, THBS1, EMC4, FSIP1, KATNBL1 GPR176, PGBD4, EIF2AK4, SRP14, EMC7, SRP14-AS1, BMF, FRMD6, TMX1, GNG2 Negative: IPO7, ZNF143, TMEM41B, SWAP70, WEE1, SBF2-AS1, DENND5A, SCUBE2, SBF2, NRIP3 TMEM9B-AS1, ADM, AMPD3, TMEM9B, C11orf16, MTRNR2L8, RPL27A, AKIP1, ST5, TRIM66 STK33, RNF141, LMO1, RIC3, TUB, MRVI1-AS1, LYVE1, MRVI1, CTR9, EIF4G2 PC_ 5 Positive: PITRM1, PFKP, PITRM1-AS1, ZMYND11, KLF6, AKR1E2, LINC00200, DIP2C, IDI1, IDI2-AS1 WDR37, CDK4, PRR26, GTPBP4, MARCH9, AKR1C3, TSPAN31, AGAP2-AS1, PIP4K2C, B4GALNT1 LARP4B, OS9, CYP27B1, KIF5A, DCTN2, MBD6, TUBAL3, METTL1, DDIT3, NET1 Negative: TMEM189-UBE2V1, TMEM189, SNAI1, UBE2V1, RNF114, SPATA2, SLC9A8, CEBPB, B4GALT5, PTGIS PTPN1, KCNB1, ZFAS1, PARD6B, BCAS4, ADNP, DPM1, MOCS3, ZNFX1, KCNG1 NFATC2, DDX27, ATP9A, STAU1, SALL4, ZFP64, ZNF217, BCAS1, CSE1L, PFDN4 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: PSD4, IL1RN, IL36RN, PAX8, IL1A, CKAP2L, CBWD2, RABL2A, SLC20A1, CHCHD5 POLR1B, SLC35F5, TTL, RGPD8, ACTR3, ZC3H6, ZC3H8, DPP10, FBLN7, TMEM87B DDX18, MERTK, ANAPC1, CCDC93, BCL2L11, ACOXL, INSIG2, BUB1, NPHP1, EN1 Negative: ARFGAP1, KCNQ2, NKAIN4, EEF1A2, YTHDF1, TCFL5, GID8, COL9A3, DIDO1, PPDPF GMEB2, STMN3, RTEL1, HDGF, RTEL1-TNFRSF6B, PRCC, ARHGEF11, ETV3, CD1D, IFI16 CRP, DUSP23, NDUFS8, TCIRG1, UNC93B1, ALDH3B2, CHKA, LRP5, C11orf24, ACY3 PC_ 2 Positive: HAUS8, MYO9B, CPAMD8, USE1, SIN3B, TMEM38A, SMIM7, OCEL1, NR2F6, MED26 USHBP1, BABAM1, SLC35E1, ANKLE1, CHERP, ABHD8, C19orf44, MRPL34, EPS15L1, DDA1 KLF2, GTPBP3, AP1M1, PLVAP, FAM32A, BST2, RAB8A, MVB12A, TPM4, SLC27A1 Negative: HAX1, UBAP2L, ATP8B2, C1orf43, TPM3, IL6R, RPS27, RAB13, SHE, JTB UBE2Q1, CREB3L4, SLC39A1, ADAR, CRTC2, DENND4B, PMVK, GATAD2B, PBXIP1, SLC27A3 PYGO2, INTS3, SERPINH1, MAP6, DGAT2, GDPD5, UVRAG, KLHL35, WNT11, RPS3 PC_ 3 Positive: PCDH18, PCDH10, C4orf33, TRA2A, CCDC126, IGF2BP3, FAM221A, SCLT1, STK31, PGRMC2 NPY, MALSU1, LARP1B, GPNMB, MPP6, MFSD8, NUPL2, PLK4, HSPA4L, OSBPL3 SLC25A31, KLHL7, INTU, ANKRD50, SPRY1, SPATA5, NUDT6, ADAD1, BBS12, FGF2 Negative: LHPP, FAM53B, ZRANB1, CTBP2, NKX1-2, OAT, UROS, CHST15, BCCIP, CPXM2 DHX32, FANK1, ADAM12, BUB3, C10orf90, ACADSB, IKZF5, DOCK1, FAM196A, PTPRE PSTK, MKI67, MGMT, GLRX3, C10orf88, TCERG1L, FAM24B, PPP2R2D, CUZD1, BNIP3 PC_ 4 Positive: C16orf72, GRIN2A, ATF7IP2, EMP2, TEKT5, NUBP1, TVP23A, CIITA, DEXI, IPO7 TMEM41B, ZNF143, CLEC16A, DENND5A, WEE1, SWAP70, SCUBE2, NRIP3, TMEM9B-AS1, SBF2-AS1 RMI2, TMEM9B, C11orf16, SOCS1, SBF2, LITAF, ADM, SNN, AMPD3, TXNDC11 Negative: KLHL28, C14orf28, PRPF39, LRFN5, FKBP3, FBXO33, FANCM, MIA2, MIS18BP1, PNN RPL10L, TRAPPC6B, GEMIN2, MDGA2, LINC00648, SEC23A, RPS29, AL139099.1, MIPOL1, LRR1 RPL36AL, MGAT2, DNAAF2, SLC25A21, POLE2, KLHDC2, NEMF, WDR92, AL627171.1, C1D PC_ 5 Positive: POLD3, PGM2L1, LIPT2, KCNE3, SPCS2, RNF169, XRRA1, NEU3, P4HA3, SLCO2B1 ARRB1, PPME1, RPS3, C2CD3, KLHL35, UCP2, GDPD5, DNAJB13, SERPINH1, PAAF1 MAP6, COA4, DGAT2, MRPL48, UVRAG, RAB6A, WNT11, LRRC32, PLEKHB1, TSKU Negative: TSN, CLASP1, TFCP2L1, RALB, TMEM185B, EPB41L5, PTPN4, TMEM177, TMEM37, DBI C2orf76, STEAP3, EN1, INSIG2, CCDC93, DDX18, ARID3B, UBL7-AS1, UBL7, SEMA7A CYP11A1, CCDC33, STRA6, CLK3, DPP10, ISLR, EDC3, PML, STOML1, CYP1A1 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: ZNF280D, TCF12, LINC00926, MNS1, CGNL1, TEX9, MYZAP, POLR2M, RFX7, ALDH1A2 NEDD4, PRTG, LIPC, PYGO1, ADAM10, RNF111, SLTM, CCNB2, MYO1E, FAM81A GCNT3, GTF2A2, BNIP2, ANXA2, RORA, VPS13C, TLN2, AC103740.1, TPM1, LACTB Negative: ILVBL, SLC1A6, OR7C1, NOTCH3, ZNF333, EPHX3, NDUFB7, BRD4, TECR, AKAP8 DNAJB1, AKAP8L, GIPC1, WIZ, UCA1, TPM4, PKN1, RAB8A, TMEM199, POLDIP2 FAM32A, VTN, AP1M1, DDX39A, SARM1, KLF2, EPS15L1, SLC46A1, C19orf44, KXD1 PC_ 2 Positive: RIN3, LGMN, CPSF2, UBTD1, NDUFB1, MMS19, GOLGA5, ATXN3, ZDHHC16, ANKRD2 EXOSC1, PI4K2A, TRIP11, PGAM1, CHGA, MORN4, RRP12, FBLN5, AVPI1, ITPK1 FRAT2, TC2N, MARVELD1, ITPK1-AS1, CCDC88C, ZFYVE27, FRAT1, MOAP1, RPS6KA5, TTC7B Negative: ZDHHC11, ZDHHC11B, AC026740.1, BRD9, TPPP, TRIP13, AC116351.1, NUDT2, CEP72, NKD2 KIF24, SLC9A3, SLC12A7, EXOC3, UBAP1, SLC6A19, AHRR, TERT, DCAF12, PDCD6 CLPTM1L, XPOT, UBAP2, C12orf56, TBK1, RASSF3, SLC6A3, C12orf66, GNS, UBE2R2 PC_ 3 Positive: TTC14, USP13, CCDC39, FXR1, NDUFB5, MRPL47, DNAJC19, ACTL6A, SOX2, GNB4 MFN1, ATP11B, ZNF639, KCNMB3, DCUN1D1, PIK3CA, MCCC1, ZMAT3, LAMP3, LINC00501 MCF2L2, TBL1XR1, B3GNT5, NAALADL2, KLHL24, YEATS2, NLGN1, FAM72A, CTSE, AC131160.1 Negative: PHACTR1, EDN1, HIVEP1, TBC1D7, ADTRP, GFOD1, NEDD9, ERVFRD-1, SMIM13, ELOVL2 GCM2, MAK, SYCP2L, TMEM14B, TMEM14C, PAK1IP1, C6orf52, GCNT2, TFAP2A, SLC35B3 EEF1E1, EEF1E1-BLOC1S5, BLOC1S5, BLOC1S5-TXNDC5, TXNDC5, BMP6, SNRNP48, DSP, RIOK1, SSR1 PC_ 4 Positive: LRFN4, SYT12, RHOD, PC, KDM2A, ANKRD13D, RCE1, SSH3, C11orf80, POLD4 SPTBN2, CLCF1, RBM4B, RAD9A, RBM4, PPP1CA, RBM14-RBM4, RPS6KB2, RBM14, PTPRCAP CCS, CORO1B, CCDC87, TMEM134, CTSF, AIP, ACTN3, PITPNM1, ZDHHC24, CDK2AP2 Negative: FKBP6, TRIM50, NSUN5, POM121, TYW1B, CALN1, AUTS2, TYW1, SBDS, TMEM248 RABGEF1, CDK17, ELK3, NEDD1, TMPO-AS1, LTA4H, KCTD7, TMPO, HAL, SLC25A3 AMDHD1, TPST1, IKBIP, APAF1, SNRPF, CRCP, ANKS1B, NTN4, UHRF1BP1L, USP44 PC_ 5 Positive: MAP3K4, AGPAT4, QKI, SFT2D1, PLG, MPC1, RPS6KA2, SLC22A3, RNASET2, FGFR1OP IGF2R, UNC93A, PNLDC1, MRPL18, DACT2, TCP1, ACAT2, SMOC2, WTAP, THBS2 WDR27, SOD2, C6orf120, RSPH3, PHF10, C6orf99, EZR, CDT1, PIEZO1, SYTL3 Negative: S100PBP, KIAA1522, YARS, FNDC5, TMEM54, RNF19B, AK2, TRIM62, ZNF362, AL513327.1 PHC2, ZSCAN20, SMIM12, GJB5, GJB3, GJA4, DLGAP3, ZMYM6, ZMYM1, SFPQ PSMB2, ZMYM4, C1orf216, NCDN, CLSPN, KIAA0319L, AGO4, AGO1, AGO3, ZFP57 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: GMFB, CNIH1, CGRRF1, SAMD4A, CDKN3, GCH1, BMP4, WDHD1, DDHD1, SOCS4 MAPK1IP1L, FERMT2, GNPNAT1, LGALS3, STYX, DLGAP5, PSMC6, FBXO34, FANCL, ATG14 GPR137C, VRK2, TBPL2, EFEMP1, TXNDC16, KTN1-AS1, PNPT1, NID2, CCDC88A, GNG2 Negative: LRFN4, PC, SYT12, RCE1, C11orf80, RHOD, SPTBN2, KDM2A, ANKRD13D, RBM4B RBM4, SSH3, RBM14-RBM4, POLD4, RBM14, CLCF1, CCS, CCDC87, RAD9A, CTSF ACTN3, ZDHHC24, PPP1CA, BBS1, DPP3, RPS6KB2, PELI3, MRPL11, PTPRCAP, SLC29A2 PC_ 2 Positive: RPS5, AC012313.1, ZNF584, ZNF132, ZNF324B, ZNF324, ZNF446, NOP10, NUTM1, LPCAT4 SLC12A6, GOLGA8A, EMC4, KATNBL1, PGBD4, EMC7, FSIP1, GPR176, DEK, EIF2AK4 SRP14, SRP14-AS1, BMF, BUB1B, PAK6, PLCB2, ZNF530, KNSTRN, ZIK1, ZNF416 Negative: SLC35C2, CD40, ELMO2, NCOA5, ZNF334, SLC12A5, MMP9, SLC13A3, ZNF335, TP53RK PCIF1, SLC2A10, PLTP, EYA2, CTSA, ZMYND8, NEURL2, AL031666.2, ZSWIM1, NCOA3 SULF2, ZSWIM3, PREX1, ACOT8, ARFGEF2, SNX21, CSE1L, TNNC2, STAU1, UBE2C PC_ 3 Positive: NR1H3, MADD, SLC39A13, ACP2, PSMC3, DDB2, PACSIN3, CELF1, NDUFS3, ARFGAP2 C11orf49, PTPMT1, LRP4, CKAP5, KBTBD4, F2, ZNF408, C1QTNF4, ARHGAP1, ATG13 MTCH2, EIF3I, MARCKSL1, LCK, HDAC1, TMEM234, BSDC1, IQCC, FAM229A, SDHC Negative: ATRNL1, GFRA1, TRUB1, C10orf82, HSPA12A, FAM160B1, ENO4, ABLIM1, SLC18A2, AFAP1L2 PDZD8, TDRD1, RAB11FIP2, NHLRC2, CASC2, NMD3, B3GALNT1, DCLRE1A, PPM1L, SPTSSB FAM204A, BCHE, KPNA4, TRIM59, ZBBX, SMC4, PDCD10, IFT80, CASP7, PRLHR PC_ 4 Positive: DCLRE1A, CASP7, NHLRC2, NRAP, TDRD1, HABP2, TCF7L2, AFAP1L2, ABLIM1, VTI1A ZDHHC6, FAM160B1, ACSL5, TRUB1, ATRNL1, GPAM, GFRA1, SHOC2, C10orf82, HSPA12A BBIP1, ENO4, SLC18A2, PDCD4, PDZD8, RAB11FIP2, CASC2, SMC3, FAM204A, PRLHR Negative: HBP1, PRKAR2B, COG5, CCDC71L, NAMPT, GPR22, SYPL1, CDHR3, ATXN7L1, DUS4L EFCAB10, RINT1, PUS7, BCAP29, SRPK2, CBLL1, KMT2E, KMT2E-AS1, LINC01004, LHFPL3 ORC5, DLD, RELN, PSMC2, LAMB1, DNAJC2, PMPCB, NRCAM, NAPEPLD, ARMC10 PC_ 5 Positive: LAMA4, FAM229B, RFPL4B, MARCKS, TUBE1, FYN, HDAC2, FRK, TRAF3IP2, DSE TSPYL4, TSPYL1, NT5DC1, RWDD1, RSPH4A, TRAF3IP2-AS1, KPNA5, FAM162B, REV3L, ROS1 GOPC, DCBLD1, SLC16A10, NUS1, SLC35F1, CEP85L, RPF2, MCM9, GTF3C6, AMD1 Negative: PNLDC1, PLG, SLC22A3, IGF2R, MAP3K4, MRPL18, AGPAT4, TCP1, QKI, ACAT2 SFT2D1, WTAP, SOD2, RSPH3, MPC1, RPS6KA2, C6orf99, RNASET2, EZR, SYTL3 FGFR1OP, DYNLT1, UNC93A, TMEM181, DACT2, TULP4, SMOC2, GTF2H5, THBS2, SERAC1 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: PITPNM1, AIP, CDK2AP2, TMEM134, GSTP1, CORO1B, PTPRCAP, NDUFV1, RPS6KB2, NUDT8 PPP1CA, ACY3, RAD9A, CLCF1, ALDH3B2, POLD4, SSH3, UNC93B1, ANKRD13D, KDM2A RHOD, NDUFS8, SYT12, LRFN4, TCIRG1, PC, RCE1, C11orf80, CHKA, SPTBN2 Negative: TBPL2, KTN1-AS1, ATG14, FBXO34, DLGAP5, LGALS3, MAPK1IP1L, SOCS4, WDHD1, GCH1 FERMT2, DDHD1, SAMD4A, BMP4, GNPNAT1, CGRRF1, GMFB, CNIH1, CDKN3, STYX PSMC6, GPR137C, TXNDC16, NID2, GNG2, FRMD6, TMX1, ABHD12B, PYGL, NIN PC_ 2 Positive: IMPA2, TUBB6, MPPE1, RAB31, CHMP1B, VAPA, GNAL, NAPG, AFG3L2, PPP4R1 SPIRE1, RALBP1, AP005482.1, TWSG1, PSMG2, ANKRD12, CEP76, NDUFV2, RAB12, PTPN2 PTPRM, SEH1L, LAMA1, CEP192, LINC00668, LDLRAD4, ARHGAP28, TMEM200C, FAM210A, EPB41L3 Negative: U2AF2, CCDC106, EPN1, NLRP9, ZNF865, ZNF784, ZNF581, ZNF580, RFPL4A, ZNF524 RFPL4AL1, FIZ1, NLRP11, NAT14, NLRP4, ZNF628, ISOC2, NLRP13, UBE2S, NLRP8 RPL28, NLRP5, COX6B2, TMEM150B, ZNF787, HSPBP1, ZNF444, PPP6R1, TMEM86B, ZSCAN5B PC_ 3 Positive: ZCCHC14, JPH3, AC010536.1, KLHDC4, SLC7A5, BANP, ZFPM1, ZC3H18, CYBA, MVD SNAI3-AS1, SNAI3, RNF166, CTU2, PIEZO1, CDT1, MRPS31, FOXO1, COG6, SYCE1 NHLRC3, CYP2E1, SPRN, MTG1, PAOX, ECHS1, FUOM, PRAP1, ZNF511, TUBGCP2 Negative: DEPDC5, PISD, YWHAH, PRR14L, RFPL2, SFI1, RFPL3S, RTCB, EIF4ENIF1, FBXO7 DRG1, SYN3, PATZ1, TIMP3, PIK3IP1, HMGXB4, LIMK2, RNF185, TOM1, PLA2G3 INPP5J, HMOX1, SMTN, PDLIM7, DBN1, MORC2, MORC2-AS1, DOK3, PRR7, OSBP2 PC_ 4 Positive: RGS7, GREM2, FMN2, CHRM3, ZP4, RYR2, MTR, ACTN2, HEATR1, LGALS8 EDARADD, GPR137B, NID1, LYST, GNG4, B3GALNT2, TBCE, GGPS1, ARID4B, RBM34 C6orf203, OVCH1-AS1, TMTC1, ERGIC2, IPO8, FAR2, CCDC91, PTHLH, CAPRIN2, MRPS35 Negative: EPB41L1, AL121895.1, CNBD2, AAR2, DLGAP4, MYL9, SOGA1, SAMHD1, TGIF2, DSN1 RBL1, SCAND1, SLA2, NDRG3, MROH8, PHF20, RPN2, RBM39, MANBAL, ROMO1 SRC, BLCAP, NFS1, NNAT, RBM12, CTNNBL1, CPNE1, SPAG4, TTI1, ERGIC3 PC_ 5 Positive: PPIL6, CD164, SMPD2, CEP57L1, MICAL1, SESN1, ZBTB24, ARMC2, AK9, FOXO3 FIG4, WASF1, SNX3, CDC40, NR2E1, CDK19, OSTM1, AMD1, SEC63, GTF3C6 SOBP, RPF2, PDSS2, SLC16A10, BEND3, REV3L, C6orf203, FDPS, RUSC1-AS1, PKLR Negative: ZNF326, LRRC8D, LRRC8C, ZNF644, TOMM20, CDC7, LRRC8B, IRF2BP2, RBM34, ARID4B TGFBR3, GBP4, TARBP1, GGPS1, BRDT, TBCE, COA6, GBP2, B3GALNT2, EPHX4 GNG4, BTBD8, SLC35F3, LYST, GBP1, C1orf146, KCNK1, GLMN, NID1, RBMXL1 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: CADM2, CHMP2B, GBE1, CGGBP1, ROBO2, ZNF717, ABI3BP, TFG, LINC00960, ZNF654 ROBO1, TMEM45A, CRYBG3, GABRR3, LNP1, ARL6, MTRNR2L12, C3orf38, FRG2C, CLDND1 SENP7, NSUN3, NIT2, PROS1, ARL13B, FILIP1L, TBC1D23, CPOX, RPL24, ZBTB11-AS1 Negative: XKR8, SMPDL3B, RPA2, PPP1R8, THEMIS2, EYA3, STX12, FAM76A, PTAFR, IFI6 DNAJC8, SESN2, AHDC1, MED18, WASF2, GPR3, PHACTR4, MAP3K6, RCC1, RMDN3 SYTL1, TMEM222, RAD51, TTLL1, TRNAU1AP, WDTC1, BIK, ZNFX1, ZFAS1, KCNB1 PC_ 2 Positive: SPAST, SLC30A6, LCLAT1, NLRC4, LBH, YPEL5, AC016907.2, YIPF4, CLIP4, BIRC6 WDR43, TTC27, TRMT61B, LTBP1, SPDYA, RASGRP3, FAM98A, PPP1CB, CRIM1, PLB1 FEZ2, FOSL2, VIT, RBKS, MRPL33, STRN, SLC4A1AP, HEATR5B, SUPT7L, GPN1 Negative: AGAP4, ZFAND4, TPBG, MARCH8, UBE3D, IBTK, BCKDHB, ALOX5, DOPEY1, ZNF22 TTK, PGM3, C10orf25, RWDD2A, RASSF4, CXCL12, ZNF32, ELOVL4, ZNF485, ME1 ZNF239, ZNF487, PRSS35, SH3BGRL2, HNRNPF, LCA5, FXYD4, HMGN3-AS1, RASGEF1A, HMGN3 PC_ 3 Positive: RPL10L, MDGA2, MIS18BP1, LINC00648, FANCM, RPS29, FKBP3, AL139099.1, PRPF39, LRR1 KLHL28, RPL36AL, C14orf28, MGAT2, LRFN5, DNAAF2, FBXO33, POLE2, MIA2, KLHDC2 PNN, NEMF, TRAPPC6B, AL627171.1, GEMIN2, ARF6, SEC23A, VCPKMT, MIPOL1, SOS2 Negative: NR2F6, OCEL1, USE1, MYO9B, HAUS8, RAB24, NSD1, MXD3, PRELID1, FGFR4 LMAN2, CPAMD8, ZNF346, RGS14, F12, GRK6, UIMC1, SIN3B, PRR7, TSPAN17 TMEM38A, DBN1, EIF4E1B, PDLIM7, SMIM7, SNCB, DOK3, MED26, GPRIN1, DDX41 PC_ 4 Positive: SYNM, IGF1R, TTC23, ARRDC4, LRRC28, NR2F2, MEF2A, NR2F2-AS1, LYSMD4, MCTP2 ADAMTS17, RGMA, ASB7, CHD2, ALDH1A3, FAM174B, ETFA, NRG4, LRRK1, FBXO22 ST8SIA2, PYGO1, C15orf65, CCPG1, UBE2Q2, PIGB, CHSY1, SLCO3A1, RAB27A, SNX33 Negative: SYT13, PRDM11, SLC35C1, TP53I11, CRY2, MAPK8IP1, TSPAN18, PEX16, PHF21A, CREB3L1 DGKZ, CD82, MDK, AMBRA1, HARBI1, EXT2, ATG13, ACCSL, ARHGAP1, C11orf96 ZNF408, ALKBH3, F2, CKAP5, HSD17B12, LRP4, C11orf49, PACSIN3, ARFGAP2, DDB2 PC_ 5 Positive: DBN1, PDLIM7, PRR7, GRK6, B4GALT7, TMED9, DOK3, FAM193B, DDX41, F12 N4BP3, RMND5B, RGS14, NHP2, HNRNPAB, PHYKPL, COL23A1, LMAN2, CLK4, ZNF354A ZNF354B, ZFP2, ZNF454, PRELID1, ZNF879, ZNF354C, MXD3, RAB24, NSD1, FGFR4 Negative: ZNF256, C19orf18, ZNF135, ZNF606, ZSCAN18, ZNF329, ZNF274, ZNF544, ZNF8, ERVK3-1 AC020915.1, ZNF154, ZNF551, ZSCAN4, ZNF211, AC010642.2, ZNF134, ZNF530, ZSCAN22, ZIK1 ZNF416, A1BG, ZNF550, ZNF549, A1BG-AS1, ZNF773, ZNF419, ZNF497, ZNF772, ZNF749 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: GNB5, MYO5C, BCL2L10, MYO5A, ARPP19, FAM214A, WDR72, DUOX1, UNC13C, DUOXA1 SHF, GATM, SPATA5L1, SORD, SLC30A4, RSL24D1, BLOC1S6, SEMA6D, MYEF2, DUT TRIM69, RAB27A, EID1, FBN1, CEP152, PIGB, SECISBP2L, COPS2, GALK2, B2M Negative: CKAP5, LRP4, F2, C11orf49, ZNF408, ARFGAP2, ARHGAP1, PACSIN3, ATG13, DDB2 ACP2, HARBI1, NR1H3, AMBRA1, MDK, MADD, SLC39A13, DGKZ, CREB3L1, PHF21A PEX16, MAPK8IP1, CRY2, SLC35C1, SYT13, RAD23A, GADD45GIP1, AC092069.1, CALR, FARSA PC_ 2 Positive: ABCC6, NOMO3, ABCC1, XYLT1, NOMO2, FOPNL, RPS15A, ARL6IP1, MYH11, SMG1 NDE1, TMC7, C16orf45, DTL, MPV17L, COQ7, RRN3, PPP2R5A, ITPRIPL2, NTAN1 TMEM206, PDXDC1, SYT17, NPIPA1, NENF, TMC5, NOMO1, GDE1, ATF3, PLA2G10 Negative: APOBEC3B, CBX6, DNAL4, SUN2, AL021707.2, GTPBP1, JOSD1, TOMM22, CBY1, FAM227A DMC1, DDX17, KDELR3, SCAMP5, PPCDC, RPP25, C15orf39, COMMD4, COX5A, NEIL1 FAM219B, MAN2C1, MPI, SIN3A, CSNK1E, SCAMP2, PTPN9, ULK3, SNUPN, TMEM184B PC_ 3 Positive: PHLPP2, AP1G1, MARVELD3, ATXN1L, CHST4, IST1, ZNF821, ZNF19, PKD1L3, ZNF23 DHODH, ZFHX3, PMFBP1, TXNL4B, DHX38, PSMD7, NPIPB15, GLG1, RFWD3, MLKL NAGPA, C16orf89, ALG1, RBFOX1, FA2H, PPL, METTL22, ABAT, TMEM186, PMM2 Negative: APBB2, UCHL1, LIMCH1, TMEM33, DCAF4L1, SLC30A9, BEND4, SHISA3, ATP8A1, GUF1 GNPDA2, COX7B2, COMMD8, ATP10D, NFXL1, NIPAL1, ZNF490, ZNF564, ZNF443, ZNF799 ZNF709, ZNF791, ZNF442, MAN2B1, ZNF625-ZNF20, ZNF625, ZNF20, CNGA1, ZNF563, ZNF44 PC_ 4 Positive: USP18, DGCR2, TUBA8, SLC25A1, PEX26, MRPL40, CLTCL1, CDC45, C22orf39, HIRA MICAL3, CLDN5, GP1BB, GNB1L, TXNRD2, COMT, TANGO2, DGCR8, TRMT2A, RANBP1 ZDHHC8, RTN4R, DGCR6L, FGFR1OP2, TM7SF3, ITPR2, AC024896.1, RASSF8, MED21, RASSF8-AS1 Negative: ZNF688, ZNF785, MMP15, KIFC3, ZNF689, TEPP, USB1, KATNB1, PRR14, TMEM106A FBRS, NBR1, LINC00910, BRCA1, ARL4D, SRCAP, DHX8, ETV4, RND2, MPP3 DUSP3, MPP2, PPY, DEK, VAT1, PHKG2, PYY, IFI35, NAGS, RNF40 PC_ 5 Positive: FIP1L1, LNX1, SCFD2, CHIC2, RASL11B, PDGFRA, ERVMER34-1, KIT, USP46, KDR SPATA18, SRD5A3, SGCB, TMEM165, CLOCK, DCUN1D4, PDCL2, OCIAD2, NMU, EXOC1 OCIAD1, CEP135, FRYL, KIAA1211, ZAR1, AASDH, SLC10A4, C6orf99, EZR, SYTL3 Negative: SLC7A9, TDRD12, NUDT19, ANKRD27, PDCD5, DPY19L3, ZNF507, TSHZ3, URI1, CCNE1 C19orf12, PLEKHF1, POP4, ZNF101, ATP13A1, GMIP, C6orf203, BEND3, LPAR2, PDSS2 SOBP, SEC63, OSTM1, NR2E1, SNX3, FOXO3, ARMC2, SESN1, CEP57L1, CD164 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: PPP2R3A, MSL2, EPHB1, CEP63, PCCB, ANAPC13, ELMOD1, SLN, AMOTL2, ALKBH8 STAG1, SLC35F2, CWF19L2, SLC35G2, RYK, RAB39A, IER5, CUL5, NCK1, GLUL SLCO2A1, ACAT1, RNASEL, IL20RB, NPAT, RGS16, RAB6B, ATM, CLDN18, RGS8 Negative: ALDH6A1, LIN52, ABCD4, VRTN, NPC2, ISCA2, LTBP2, AREL1, FCF1, YLPM1 DLST, RPS6KL1, PGF, EIF2B2, MLH3, ACYP1, ZC2HC1C, NEK9, ZNF442, ZNF799 ZNF563, ZNF443, ZNF44, ZNF709, TMED10, ZNF136, ZNF564, ZNF625, ZNF490, ZNF791 PC_ 2 Positive: TSPYL1, RWDD1, DSE, RSPH4A, KPNA5, TSPYL4, FAM162B, ROS1, NT5DC1, GOPC DCBLD1, FRK, NUS1, HDAC2, SLC35F1, MARCKS, CEP85L, MCM9, RFPL4B, ASF1A LAMA4, FAM184A, FAM229B, MAN1A1, TUBE1, TBC1D32, FYN, TRAF3IP2, GJA1, TRAF3IP2-AS1 Negative: PIM3, CRELD2, ALG12, ZBED4, BRD1, C22orf34, FAM19A5, TBC1D22A, CERK, GRAMD4 CELSR1, TRMU, GTSE1, MSX1, TTC38, CYTL1, CDPF1, PPARA, STK32B, WNT7B ATXN10, EVC2, ASL, CRCP, FBLN1, GUSB, VKORC1L1, ZNF92, ERV3-1, ZNF117 PC_ 3 Positive: FKTN, TMEM38B, ZNF462, RAD23B, KLF4, FAM206A, CTNNAL1, TMEM245, EPB41L4B, PTPN3 AKAP2, EEF1A2, KCNQ2, PPDPF, GMEB2, ARFGAP1, STMN3, NKAIN4, YTHDF1, RTEL1 GID8, RTEL1-TNFRSF6B, DIDO1, C9orf152, TCFL5, COL9A3, OGFR, TXN, MRGBP, UMPS Negative: SPOCK2, CHST3, ASCC1, PSAP, ANAPC16, CDH23, DDIT4, SLC29A3, DNAJB12, UNC5B MICU1, PCBD1, MCU, SGPL1, ADAMTS14, OIT3, NODAL, PLA2G12B, EIF4EBP2, P4HA1 NUDT13, LRRC20, ECD, PPA1, FAM149B1, SAR1A, DNAJC9, MRPS16, TYSND1, DNAJC9-AS1 PC_ 4 Positive: OPA1, HRASLS, HES1, MB21D2, FGF12, CCDC50, UTS2B, IL1RAP, LINC00884, CLDN1 TMEM44-AS1, TMEM44, LSG1, AC046143.1, FAM43A, XXYLT1, ACAP2, PPP1R2, APOD, MUC20 MUC4, TNK2, TFRC, SLC51A, PCYT1A, TCTEX1D2, TM4SF19, UBXN7, RNF168, PITX1 Negative: NOMO2, XYLT1, NOMO3, ABCC6, RPS15A, ABCC1, FOPNL, MYH11, ARL6IP1, NDE1 C16orf45, SMG1, MPV17L, RRN3, TMC7, NTAN1, COQ7, PDXDC1, NPIPA1, NOMO1 PLA2G10, ITPRIPL2, BFAR, PARN, SYT17, MKL2, TMC5, ERCC4, CPPED1, GDE1 PC_ 5 Positive: CNIH3, CAPN2, DNAH14, WDR26, SUSD4, TLR5, TP53BP2, CNIH4, NVL, FBXO28 DEGS1, DISP1, BROX, AIDA, MIA3, TAF1A, SLC35A5, ATP6V1A, GRAMD1C, CCDC80 HHIPL2, ZDHHC23, TIGIT, NAA50, ZBTB20, GAP43, SPICE1, CD200R1, DUSP10, LSAMP Negative: CHRNB1, ZBTB4, FGF11, POLR2A, TMEM102, TNFSF12, NLGN2, TNFSF12-TNFSF13, TMEM256, TNK1 TNFSF13, KCTD11, SENP3, ACAP1, SENP3-EIF4A1, NEURL4, EIF4A1, GPS2, CD68, EIF5A MPDU1, YBX2, SOX15, SLC2A4, FXR2, CLDN7, ELP5, SAT2, ATP1B2, TP53 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: EIF3E, RSPO2, ANGPT1, EMC2, OXR1, ZFPM2, TMEM74, MECOM, GOLIM4, TERC RPS23, ATP6AP1L, LRP12, NUDCD1, TMEM167A, SERPINI1, CKMT2, CKMT2-AS1, ZCCHC9, ATG10 DPYS, ENY2, XRCC4, SSBP2, RASGRF2, ACTRT3, PDCD10, RIMS2, VCAN, MSH3 Negative: NNAT, CTNNBL1, BLCAP, MANBAL, SRC, TTI1, RPN2, MROH8, RPRD1B, LBP RALGAPB, TGM2, ACTR5, RBL1, PPP1R16B, SAMHD1, FAM83D, TGIF2, SOGA1, SLA2 NDRG3, DSN1, MYL9, DHX35, TOP1, DLGAP4, PLCG1, ZHX3, AAR2, LPIN3 PC_ 2 Positive: ZNF446, ZNF324, ZNF324B, ZNF132, ZNF584, AC012313.1, RPS5, ZNF497, A1BG-AS1, A1BG ZSCAN22, AC010642.2, AC020915.1, ERVK3-1, ZNF8, ZNF544, ZNF274, ZNF329, ZSCAN18, ZNF135 ZNF606, C19orf18, ZNF256, RTEL1, RTEL1-TNFRSF6B, ANKLE1, ABHD8, BABAM1, USHBP1, MRPL34 Negative: STAG1, SLC35G2, NCK1, IL20RB, PCCB, CLDN18, DZIP1L, MSL2, DBR1, PPP2R3A ARMC8, EPHB1, NME9, CEP63, MRAS, ESYT3, ANAPC13, CEP70, FAIM, AMOTL2 PIK3CB, RYK, PRR23A, SLCO2A1, MRPS22, RAB6B, PRR23B, PRR23C, SRPRB, COPB2 PC_ 3 Positive: HTRA4, TM2D2, PLEKHA2, ADAM9, TACC1, ADAM32, FGFR1, LETM2, ADAM18, DDHD2 IDO1, BAG4, ZMAT4, LSM1, SFRP1, STAR, ASH2L, GOLGA7, EIF4EBP1, GINS4 RAB11FIP1, ANK1, BRF2, ERLIN2, KAT6A, ZNF703, AP3M2, UNC5D, PLAT, RNF122 Negative: ARMC8, DBR1, DZIP1L, NME9, CLDN18, MRAS, ESYT3, IL20RB, CEP70, FAIM NCK1, SLC35G2, PIK3CB, STAG1, PRR23A, PCCB, MRPS22, MSL2, SLCO2A1, RYK AMOTL2, PPP2R3A, EPHB1, PRR23B, ANAPC13, RAB6B, CEP63, PRR23C, SRPRB, COPB2 PC_ 4 Positive: DNAJC13, ACAD11, ACPP, NPHP3, CPNE4, UBA5, MRPL3, NUDT16, TMEM108, NEK11 ASTE1, BFSP2-AS1, ATP2C1, CDV3, PIK3R4, CDKL1, ATP5S, L2HGDH, MAP4K5, SOS2 ATL1, VCPKMT, TOPBP1, SAV1, COL6A6, ARF6, NIN, AL627171.1, TF, PYGL Negative: WAC-AS1, WAC, MPP7, BAMBI, ARMC4, SVIL, RAB18, ACBD5, MTPAP, MASTL YME1L1, MAP3K8, ANKRD26, ABI1, PDSS1, ZNF438, APBB1IP, MYO3A, ZEB1-AS1, GPR158 ZEB1, THNSL1, ARHGAP12, ENKUR, KIF5B, PRTFDC1, EPC1, ARHGAP21, ITGB1, KIAA1217 PC_ 5 Positive: TULP4, SERAC1, GTF2H5, SYNJ2, TMEM181, SNX9, DYNLT1, ZDHHC14, SYTL3, EZR TMEM242, C6orf99, RSPH3, ARID1B, SOD2, TFB1M, WTAP, ACAT2, TIAM2, TCP1 SCAF8, MRPL18, CNKSR3, PNLDC1, IGF2R, IPCEF1, SLC22A3, RGS17, PLG, MTRF1L Negative: NEK11, ASTE1, ATP2C1, NUDT16, PIK3R4, MRPL3, COL6A6, CPNE4, COL6A5, ACPP DNAJC13, TRH, ACAD11, NPHP3, UBA5, TMEM108, TMCC1-AS1, BFSP2-AS1, TMCC1, CDV3 TOPBP1, TF, SRPRB, PUM2, SDC1, RAB6B, LAPTM4A, RHOB, SLCO2A1, PLXND1 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: ZBTB34, ZBTB43, LMX1B, MVB12B, PBX3, MAPKAP1, GAPVD1, PPDPF, GMEB2, STMN3 HSPA5, EEF1A2, RTEL1, RTEL1-TNFRSF6B, KCNQ2, ARFGAP1, RABEPK, NKAIN4, YTHDF1, PPP6C GID8, SCAI, DIDO1, GOLGA1, TCFL5, ARPC5L, COL9A3, RPL35, OGFR, WDR38 Negative: SNX18, HSPB3, LINC01033, ARL15, NDUFS4, FST, MOCS2, ITGA2, PELO, ITGA1 ISL1, LHFPL2, ARSB, DMGDH, SCAMP1, BHMT2, BHMT, JMY, AP3B1, PARP8 HOMER1, CMYA5, TBCA, MTX3, WDR41, EMB, THBS4, PDE8B, VCAN, SERINC5 PC_ 2 Positive: KIAA2026, RANBP6, UHRF2, GLDC, KDM4C, PTPRD, IFRD1, ZNF277, DOCK4, IMMP2L DNAJB9, THAP5, PNPLA8, NRCAM, LAMB1, DLD, CBLL1, BCAP29, DUS4L, TBC1D9 RNF150, ZNF330, ELMOD2, IL15, CLGN, INPP4B, RBPMS, SCOC, USP38, GTF2E2 Negative: SLC35F6, CENPA, DRC1, HADHB, DPYSL5, HADHA, RAB10, ASXL2, KIF3C, DTNB MAPRE3, DNMT3A, TMEM214, POMC, AGBL5, EFR3B, OST4, DNAJC27-AS1, KHK, CGREF1 PREB, DNAJC27, SLC5A6, ADCY3, ATRAID, CENPO, CAD, PTRHD1, NCOA1, SLC30A3 PC_ 3 Positive: LMAN2, F12, PRELID1, RGS14, GRK6, MXD3, PRR7, DBN1, RAB24, PDLIM7 DOK3, NSD1, DDX41, FGFR4, FAM193B, ZNF346, TMED9, UIMC1, B4GALT7, TSPAN17 N4BP3, EIF4E1B, SNCB, RMND5B, GPRIN1, NHP2, RNF44, HNRNPAB, FAF2, PHYKPL Negative: CNOT2, MYRFL, RAB3IP, CCT2, FRS2, YEATS4, LYZ, CPSF6, CPM, MDM2 SLC35E3, NUP107, RAP1B, MDM1, DYRK2, CAND1, GRIP1, HELB, IRAK3, TMBIM4 LLPH, HMGA2, MSRB3, LEMD3, PDZRN4, GXYLT1, SLC2A13, TBC1D30, C12orf40, YAF2 PC_ 4 Positive: PWP2, DNMT3L, TRAPPC10, AIRE, PFKL, UBE2G2, C21orf2, LRRC3, TRPM2, AP001062.1 SUMO3, PTTG1IP, ITGB2, FAM207A, SLC6A19, SLC12A7, ZDHHC11, BRD9, ZDHHC11B, TRIP13 TPPP, AC026740.1, AC116351.1, NKD2, TERT, CEP72, SLC9A3, EXOC3, AHRR, PDCD6 Negative: SF3A1, CCDC157, RNF215, SEC14L2, MTFP1, SEC14L4, PES1, TCN2, SLC35E4, DUSP18 OSBP2, MORC2-AS1, MORC2, SMTN, INPP5J, PLA2G3, RNF185, LIMK2, PIK3IP1, PATZ1 DRG1, EIF4ENIF1, SFI1, PISD, PRR14L, DEPDC5, YWHAH, RFPL2, RFPL3S, RTCB PC_ 5 Positive: FBXL7, ZNF622, BASP1, ANKH, MYO10, DNAH5, TRIO, CDH18, DAP, ANKRD33B ROPN1L, CDH12, MARCH6, CMBL, C5orf17, CCT5, CDH10, FAM173B, SNHG18, CDH6 SEMA5A, MIR4458HG, FASTKD3, MTRR, C5orf49, SRD5A1, NSUN2, MED10, LINC01019, IRX2 Negative: TRPT1, NUDT22, DNAJC4, ELL3, VEGFB, SERF2, ZNF446, FKBP2, HYPK, ZNF324 ZNF324B, PPP1R14B, ZNF132, MFAP1, ZNF584, PLCB3, WDR76, BAD, FRMD5, GPR137 ESRRA, CASC4, ZNF23, ZNF19, TRMT112, CHST4, MARVELD3, PHLPP2, CTDSPL2, AP1G1 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: NCALD, RRM2B, GRHL2, ZNF706, UBR5, KLF10, YWHAZ, AZIN1, ATP6V1C1, BAALC FZD6, PABPC1, CTHRC1, SLC25A32, ANKRD46, DCAF13, RIMS2, RNF19A, DPYS, LRP12 SPAG1, ZFPM2, POLR2K, OXR1, FBXO43, ANGPT1, LINC01030, RSPO2, COX6C, DLGAP5 Negative: ZSWIM4, C19orf57, AC008686.1, CC2D1A, C19orf53, MRI1, DCAF15, CCDC130, CACNA1A, RFX1 IER2, IL27RA, STX10, NACC1, PALM3, C19orf67, TRMT1, SAMD1, PRKACA, LYL1 ASF1B, DDX39A, DAND5, PKN1, AC092069.1, GIPC1, DNAJB1, GADD45GIP1, TECR, RAD23A PC_ 2 Positive: RNF38, GNE, CLTA, CCIN, GLIPR2, RECK, TMEM8B, HINT2, NPR2, RGP1 GBA2, CREB3, TLN1, TPM2, ARHGEF39, CCDC107, RMRP, UBAP1, CD72, KIF24 NUDT2, TESK1, C9orf24, RAD23B, ZNF462, FAM219A, KLF4, TMEM38B, FKTN, FAM206A Negative: SAR1B, SEC24A, CDKN2AIPNL, UBE2B, CDKL3, CAMLG, PPP2CA, DDX46, SKP1, C5orf24 TCF7, VDAC1, TXNDC15, C5orf15, PCBD2, FSTL4, CATSPER3, HSPA4, PITX1, H2AFY ZCCHC10, NEUROG1, AFF4, CXCL14, SLC25A48, LEAP2, LECT2, UQCRQ, TGFBI, GDF9 PC_ 3 Positive: ZNF273, ZNF138, ZNF117, ERV3-1, ZNF107, ZNF92, ZNF680, VKORC1L1, ZNF736, GUSB ZNF679, ASL, ZNF727, CRCP, TPST1, ZNF716, KCTD7, CHCHD2, PHKG1, RABGEF1 SUMF2, CCT6A, TMEM248, PSPH, SBDS, MRPS17, TYW1, DSCR4, DSCR8, DYRK1A Negative: LTB4R2, LTB4R, NYNRIN, KHNYN, SDR39U1, STXBP6, NOVA1, FOXG1, PRKD1, G2E3 SCFD1, COCH, STRN3, AP4S1, HECTD1, HEATR5A, AL136418.1, ARSI, CAMK2A, TCOF1 CD74, CDX1, PDGFRB, RPS14, DTD2, SYNPO, MYOZ3, NDST1, CSF1R, RBM22 PC_ 4 Positive: SLC7A11, INPP4B, PCDH18, USP38, PCDH10, GAB1, C4orf33, SMARCA5, SCLT1, PGRMC2 SMARCA5-AS1, LARP1B, GYPE, MFSD8, PLK4, HSPA4L, SLC25A31, INTU, ANKRD50, SPRY1 SPATA5, NUDT6, FGF2, BBS12, ADAD1, CCNA2, EXOSC9, TMEM155, KIAA1109, BBS7 Negative: MTFP1, SEC14L2, SEC14L4, PES1, RNF215, TCN2, SLC35E4, DUSP18, CCDC157, OSBP2 SF3A1, MORC2-AS1, MORC2, SMTN, INPP5J, RNF185, PLA2G3, LIMK2, PIK3IP1, PATZ1 DRG1, EIF4ENIF1, SFI1, PISD, PRR14L, DEPDC5, APOBEC3B, YWHAH, CBX6, RFPL2 PC_ 5 Positive: N4BP3, B4GALT7, RMND5B, TMED9, NHP2, HNRNPAB, FAM193B, PHYKPL, COL23A1, DDX41 CLK4, DOK3, ZNF354A, PDLIM7, DBN1, ZNF354B, PRR7, ZFP2, GRK6, SRCAP PRR14, FBRS, ZNF454, PHKG2, ZNF689, ZNF785, F12, ZNF688, ZNF879, RGS14 Negative: SEMA5A, MIR4458HG, FASTKD3, MTRR, EFCAB10, ATXN7L1, RINT1, CDHR3, PUS7, SYPL1 C5orf49, NAMPT, SRPK2, CCDC71L, PRKAR2B, HBP1, KMT2E, COG5, SRD5A1, KMT2E-AS1 GPR22, DUS4L, BCAP29, CBLL1, DLD, LINC01004, NSUN2, LAMB1, NRCAM, LHFPL3 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: STIP1, MACROD1, OTUB1, COX8A, NAA40, MYBL2, TOX2, JPH2, RCOR2, IFT52 OSER1, MARK2, FITM2, HNF4A, L3MBTL1, TTPAL, SERINC3, PLCG1, ZHX3, C11orf95 TOP1, FAM83D, LPIN3, DHX35, PPP1R16B, PKIG, RTN3, ACTR5, EMILIN3, SRSF6 Negative: ACTRT3, TERC, MYNN, LRRC34, MECOM, SEC62, GOLIM4, GPR160, PHC3, SERPINI1 PRKCI, SKIL, PDCD10, CLDN11, RPL22L1, ZBBX, EIF5A2, VCAN, BCHE, TNIK XRCC4, TMEM167A, PLD1, SPTSSB, ATP6AP1L, FNDC3B, NMD3, RPS23, NCEH1, ATG10 PC_ 2 Positive: DNAJA1, SMU1, APTX, B4GALT1, BAG1, PRSS3, NDUFB6, UBE2R2, CHMP5, NFX1 ANKRD18B, AQP3, NOL6, UBAP2, TOPORS, DDX58, DCAF12, ACO1, UBAP1, KIF24 LINGO2, NUDT2, C9orf24, C9orf72, FAM219A, DNAI1, ENHO, CNTFR, MOB3B, RPP25L Negative: ZNF584, AC012313.1, ZNF132, ZNF324B, RPS5, ZNF324, ZNF497, ZNF446, A1BG-AS1, A1BG ZSCAN22, AC010642.2, AC020915.1, ERVK3-1, ZNF8, ZNF544, ZNF274, ZNF329, ZSCAN18, ZNF135 ZNF606, C19orf18, ZNF256, PARD6G, ADNP2, RBFADN, TRMT2A, RBFA, TXNL4A, DGCR8 PC_ 3 Positive: DNAAF2, MGAT2, POLE2, RPL36AL, LRR1, KLHDC2, AL139099.1, RPS29, LINC00648, AL627171.1 NEMF, ARF6, L2HGDH, SOS2, VCPKMT, MDGA2, ATP5S, RPL10L, MIS18BP1, CDKL1 FANCM, MAP4K5, FKBP3, PRPF39, ATL1, KLHL28, SAV1, C14orf28, LRFN5, NIN Negative: AC016907.2, YPEL5, LBH, CLIP4, WDR43, LCLAT1, TRMT61B, SPDYA, CAPN13, PPP1CB PLB1, GALNT14, FOSL2, EHD3, RBKS, MEMO1, MRPL33, DPY30, SLC4A1AP, SPAST SUPT7L, SLC30A6, GPN1, NLRC4, CCDC121, YIPF4, ZNF512, BIRC6, C2orf16, TTC27 PC_ 4 Positive: TRAF3IP2-AS1, REV3L, TRAF3IP2, SLC16A10, FYN, TUBE1, FAM229B, RPF2, LAMA4, RFPL4B GTF3C6, MARCKS, AMD1, HDAC2, CDK19, FRK, CDC40, NT5DC1, WASF1, FIG4 TSPYL4, AK9, DSE, ZBTB24, TSPYL1, MICAL1, RWDD1, SMPD2, RSPH4A, PPIL6 Negative: DNA2, SLC25A16, TET1, RUFY2, CCAR1, HNRNPH3, STOX1, PBLD, DDX50, DDX21 HERC4, SRGN, SIRT1, VPS26A, DNAJC12, SUPV3L1, CTNNA3, HKDC1, REEP3, HK1 JMJD1C-AS1, TSPAN15, JMJD1C, COL13A1, NRBF2, H2AFY2, EGR2, AIFM2, ADO, TYSND1 PC_ 5 Positive: SCFD2, RASL11B, ERVMER34-1, FIP1L1, LNX1, USP46, SPATA18, CHIC2, SGCB, PDGFRA DCUN1D4, KIT, OCIAD2, KDR, OCIAD1, SRD5A3, FRYL, TMEM165, ZAR1, CLOCK SLC10A4, PDCL2, SLAIN2, NMU, TEC, EXOC1, TXK, CEP135, AC107068.1, KIAA1211 Negative: GPR176, EIF2AK4, FSIP1, SRP14, THBS1, RASGRP1, SRP14-AS1, FAM98B, SPRED1, BMF MEIS2, C15orf41, BUB1B, DPH6, PAK6, NANOGP8, PLCB2, ZNF770, KNSTRN, AQR IVD, ACTC1, BAHD1, GOLGA8B, GOLGA8A, LPCAT4, NUTM1, NOP10, PPA1, SAR1A Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: RBM12B, TMEM67, TRIQK, PDP1, RUNX1T1, CDH17, GEM, RAD54B, LRRC69, OTUD6B FSBP, ESRP1, NECAB1, TMEM64, DPY19L4, LINC01030, INTS8, CCNE2, MARCH1, TMA16 NPY1R, TRIM61, NAF1, RAPGEF2, FAM218A, TRIM60, NDUFAF6, TMEM192, KLHL2, MSMO1 Negative: MAP1S, FCHO1, COLGALT1, JAK3, RPL18A, PGLS, SLC5A5, SLC27A1, CCDC124, MVB12A KCNN1, BST2, ARRDC2, PLVAP, MAST3, PIK3R2, GTPBP3, DDA1, IFI30, MRPL34 MPV17L2, ABHD8, RAB3A, ANKLE1, PDE4C, BABAM1, JUND, USHBP1, NR2F6, LSM4 PC_ 2 Positive: KCNK9, TRAPPC9, CHRAC1, COL22A1, AGO2, PTK2, DENND3, SLC45A4, GPR20, PTP4A3 TSNARE1, CYTL1, STK32B, MSX1, EVC2, JRK, EVC, CRMP1, WFS1, PPP2R2C MAN2B2, MRFAP1, AC093323.1, S100P, MRFAP1L1, BLOC1S4, KIAA0232, TBC1D14, TADA2B, GRPEL1 Negative: ZNF610, LHX4, ZNF480, ACBD6, XPR1, ZNF766, ZNF154, ZNF551, ZSCAN4, ZNF211 STX6, ZNF134, ZNF530, MR1, PPP2R1A, ZIK1, IER5, ZNF416, ZNF836, GLUL ZNF550, RNASEL, ZNF549, ZNF616, RGS16, CDC42SE1, ZNF773, C1orf56, MLLT11, ZNF256 PC_ 3 Positive: KCNK17, KIF6, KCNK5, SAYSD1, DAAM2, GLO1, MOCS1, TDRG1, BTBD9, UNC5CL ZFAND3, OARD1, CCDC167, APOBEC2, NFYA, CMTR1, TREML2, RNF8, TREM1, TBC1D22B FOXP4, MDFI, SEMA7A, UBL7, TFEB, UBL7-AS1, CYP11A1, ARID3B, CLK3, CCDC33 Negative: PRAMEF10, PRAMEF6, PRAMEF4, PRAMEF5, PRAMEF2, PRAMEF8, HNRNPCL1, PRAMEF9, PRAMEF11, PRAMEF13 PRAMEF1, PRAMEF18, PRAMEF12, PRAMEF15, AADACL3, PRAMEF14, DHRS3, PRAMEF19, PRAMEF17, VPS13D PRAMEF20, TNFRSF1B, LRRC38, PDPN, TNFRSF8, PRDM2, MIIP, KAZN, MFN2, FAM151A PC_ 4 Positive: ZNF264, DUXA, AURKC, ZIM3, ZNF805, USP29, ZNF460, PEG3, ZNF543, ZNF304 ZIM2, ZNF547, ZNF835, ZNF548, ZNF17, ZNF71, ZNF749, ZNF772, ZNF470, ZNF419 ZNF773, ZNF549, ZNF550, ZNF416, ZFP28, ZIK1, ZNF530, ZNF134, ZNF211, AC005498.3 Negative: GSC, DICER1, DICER1-AS1, CLMN, SYNE3, SNHG10, GLRX5, TCL1B, TCL1A, C14orf132 ATG2B, DLGAP5, LGALS3, GSKIP, FBXO34, MAPK1IP1L, ATG14, AK7, SOCS4, TBPL2 PAPOLA, WDHD1, KTN1-AS1, GCH1, VRK1, SAMD4A, CGRRF1, SETD3, GMFB, CNIH1 PC_ 5 Positive: DUXA, ZIM3, ZNF264, AURKC, USP29, ZNF805, PEG3, ZNF460, ZNF543, ZIM2 ZNF304, ZNF547, ZNF835, ZNF548, ZNF17, ZNF749, ZNF71, ZNF772, ZNF419, ZNF773 ZNF549, ZNF551, ZSCAN4, ZNF550, ZNF211, ZNF134, ZNF154, ZNF416, ZNF470, ZIK1 Negative: MGMT, TCERG1L, GLRX3, PPP2R2D, BNIP3, DPYSL4, STK32C, LRRC27, PWWP2B, INPP5A NKX6-2, UTF1, VENTX, ADAM8, TUBGCP2, ZNF511, PRAP1, PCSK9, USP24, FUOM DHCR24, PRKAA2, ECHS1, PARS2, MROH7, TTC4, DAB1, MROH7-TTC4, PAOX, FAM151A Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: ZNF230, ZNF155, ZNF222, ZNF221, ZNF223, ZNF45, ZNF284, ZNF404, ZNF224, ZNF283 ZNF225, LYPD5, ZNF234, KCNN4, ZNF227, ZNF226, ZNF235, SMG9, ZNF233, PLAUR ZNF112, ZNF428, SRRM5, ZNF285, ZNF576, IRGQ, PINLYP, XRCC1, ZNF229, ZNF180 Negative: TUBB6, AFG3L2, IMPA2, MPPE1, SPIRE1, CHMP1B, AP005482.1, GNAL, PSMG2, NAPG CEP76, VAPA, PTPN2, RAB31, PPP4R1, SEH1L, RALBP1, FAM69A, RPL5, EVI5 MTF2, TMED5, GFI1, TWSG1, CEP192, CCDC18, RPAP2, DR1, FNBP1L, ANKRD12 PC_ 2 Positive: SORBS3, PPP3CC, SCD, BLOC1S2, SEC31B, SLC39A14, CWF19L1, NDUFB8, HIF1AN, CHUK PIWIL2, MRPL43, ERLIN1, POLR3D, CPN1, DNMBP, PHYHIP, ABCC2, BMP1, COX15 REEP4, CUTC, NUDT18, ENTPD7, SLC25A28, FAM160B2, GOT1, DMTN, CNNM1, CPEB3 Negative: SFXN1, HRH2, CPLX2, MSX2, THOC3, C5orf47, SIMC1, CPEB4, KIAA1191, BOD1 ARL10, STC2, NOP16, NKX2-5, HIGD2A, BNIP1, CLTB, CREBRF, FAF2, ATP6V0E1 RNF44, GPRIN1, RPL26L1, SNCB, ERGIC1, EIF4E1B, DUSP1, TSPAN17, SH3PXD2B, UIMC1 PC_ 3 Positive: TMEM134, CORO1B, PTPRCAP, AIP, PITPNM1, RPS6KB2, PPP1CA, CDK2AP2, RAD9A, GSTP1 CLCF1, NDUFV1, POLD4, NUDT8, SSH3, ACY3, ANKRD13D, ALDH3B2, KDM2A, UNC93B1 RHOD, NDUFS8, SYT12, TCIRG1, LRFN4, PC, CHKA, RCE1, C11orf80, C11orf24 Negative: C2CD5, ETNK1, SOX5, BCAT1, C12orf77, LRMP, CASC1, KRAS, AC087239.1, RASSF8-AS1 RASSF8, ERP27, ITPR2, ART4, FGFR1OP2, C12orf60, TM7SF3, WBP11, H2AFJ, AC024896.1 THNSL1, ENKUR, PRTFDC1, ARHGAP21, KIAA1217, HIST4H4, OTUD1, MSRB2, MED21, ARMC3 PC_ 4 Positive: RBM14-RBM4, RBM14, RBM4, CCDC87, CCS, CTSF, ACTN3, ZDHHC24, RBM4B, BBS1 DPP3, PELI3, SPTBN2, MRPL11, C11orf80, SLC29A2, RCE1, BRMS1, CD248, PC YIF1A, LRFN4, CNIH2, RAB1B, SYT12, KLC2, RHOD, PACS1, KDM2A, SF3B2 Negative: GID8, YTHDF1, NKAIN4, ARFGAP1, DIDO1, KCNQ2, TCFL5, EEF1A2, COL9A3, PPDPF OGFR, GMEB2, MRGBP, STMN3, RTEL1, SLCO4A1, RTEL1-TNFRSF6B, GATA5, RBBP8NL, CABLES2 RPS21, RAB11B, MARCH2, HNRNPM, ZNF414, MYO1F, RAB11B-AS1, ZNF558, ANGPTL4, AC010323.1 PC_ 5 Positive: RALGAPB, ACTR5, PPP1R16B, LBP, TGM2, FAM83D, DHX35, TOP1, RPRD1B, PLCG1 NNAT, BLCAP, CTNNBL1, TTI1, ZHX3, SRC, MANBAL, LPIN3, RPN2, EMILIN3 CHD6, MROH8, SRSF6, RBL1, SAMHD1, L3MBTL1, SOGA1, IFT52, DSN1, MYBL2 Negative: EPS8L1, PPP1R12C, TNNT1, TNNI3, DNAAF3, PTPRH, AFG3L2, ZCCHC7, SPIRE1, TUBB6 TMEM86B, IMPA2, AP005482.1, MPPE1, PAX5, PSMG2, CHMP1B, PPP6R1, CEP76, LMBR1 MELK, DNAJB6, RNF32, MNX1, UBE3C, NOM1, GNAL, PTPN2, LINC01006, AC073133.2 Computing nearest neighbor graph Computing SNN Warning: Data is of class matrix. Coercing to dgCMatrix. Finding variable features for layer counts Calculating gene variances 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Calculating feature variances of standardized and clipped values 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **************************************************| Centering and scaling data matrix | | | 0% | |======= | 10% | |============== | 20% | |===================== | 30% | |============================ | 40% | |=================================== | 50% | |========================================== | 60% | |================================================= | 70% | |======================================================== | 80% | |=============================================================== | 90% | |======================================================================| 100% PC_ 1 Positive: VCAN, XRCC4, SNX18, HSPB3, LINC01033, GMPS, SLC33A1, KCNAB1, ARL15, SSR3 TIPARP-AS1, NDUFS4, LACTB2, AC079807.1, EPCAM, MSH2, FBXO11, CALM2, MSH6, KCNK12 FOXN2, TIPARP, PPP1R21, TTC7A, GTF2A1L, STON1-GTF2A1L, STON1, MCFD2, FSHR, AC016722.2 Negative: ADORA2A, UPB1, GUCD1, SPECC1L-ADORA2A, SPECC1L, SNRPD3, GGT5, GGT1, SUSD2, PIWIL3 SGSM1, CABIN1, KIAA1671, CRYBB2, LRP5L, DDT, MYO18B, ASPHD2, HPS4, SRRD DDTL, FOSL1, XRCC1, GSTT2B, JRK, AP000350.7, C11orf68, PINLYP, AP000350.6, DFFB PC_ 2 Positive: TSKS, AP2A1, FUZ, MED25, PTOV1-AS1, PTOV1, PNKP, SYNGR4, AKT1S1, TMEM143 EMP3, TBC1D17, CARD8, IL4I1, ZNF114, LIG1, NUP62, PLA2G4C, ATF5, VRK3 MED31, TXNDC17, ZNF473, KIAA0753, WSCD1, NLRP1, KCNC3, MIS12, SIMC1, THOC3 Negative: TBL1XR1, NAALADL2, NLGN1, ECT2, LINC00501, NCEH1, ZMAT3, FNDC3B, PLD1, PIK3CA TNIK, KCNMB3, EIF5A2, ZNF639, RPL22L1, MFN1, CLDN11, SKIL, PRKCI, PHC3 SLC35A5, ATG3, CCDC80, BTLA, CD200, CD200R1, GPR160, GCSAM, GTPBP8, C3orf52 PC_ 3 Positive: RASEF, FRMD3, IDNK, UBQLN1, GKAP1, KIF27, C9orf64, HNRNPK, RMI1, SLC28A3 NTRK2, AGTPBP1, NAA35, GOLM1, ISCA1, DAPK1, CTSL, CDK20, SPIN1, NXNL2 C9orf47, S1PR3, CKS2, SECISBP2, SEMA4D, GADD45G, SYK, AUH, NFIL3, ROR2 Negative: PHOSPHO1, ABI3, ZNF652, GNGT2, PHB, NGFR, B4GALNT2, SPOP, IGF2BP1, SLC35B1 GIP, FAM117A, KAT7, SNF8, TAC4, UBE2Z, DLX4, CALCOCO2, DLX3, HOXB13 HOXB9, ITGA3, ZNF416, ZNF530, ZNF134, ZIK1, ZNF211, ZNF550, ZSCAN4, ZNF549 PC_ 4 Positive: RAB5C, HSPB9, KAT2A, GHDC, DHX58, NKIRAS2, STAT5B, STAT5A, STAT3, DNAJC7 CNP, ATP6V0A1, TTC25, NAGLU, HSD17B1, ACLY, COASY, MLX, KLHL11, PSMC3IP TUBG1, NT5C3B, TUBG2, FKBP10, PLEKHH3, CNTNAP1, HAP1, EZH1, RAMP2, GAST Negative: VGF, AP1S1, SERPINE1, TRIM56, MUC12, ZNF628, NAT14, ISOC2, MUC3A, FIZ1 UBE2S, ZNF524, RPL28, ZNF865, ACHE, ARL6IP5, LMOD3, COX6B2, FRMD4B, MITF FOXP1, EIF4E3, TMEM150B, UFSP1, GPR27, RYBP, HSPBP1, SHQ1, SRRT, CYCS PC_ 5 Positive: TFAP2A, SLC35B3, CCR7, EEF1E1, IGFBP4, GCNT2, EEF1E1-BLOC1S5, BLOC1S5, BLOC1S5-TXNDC5, SMARCE1 TXNDC5, C6orf52, BMP6, SNRNP48, TOP2A, KRT10, PAK1IP1, DSP, TMEM14C, TMEM99 RIOK1, RARA, TMEM14B, SSR1, KRT23, SYCP2L, MAK, RREB1, CDC6, KRT15 Negative: ZNF613, ZNF649, ZNF350, ZNF615, ZNF614, ZNF432, ZNF841, ETFB, ZNF616, VSIG10L ZNF836, CTU1, PPP2R1A, KLK13, ZNF766, KLK11, KLK10, ZNF480, KLK8, ZNF610 DCAF6, KLK7, MPC2, GPR161, ADCY10, TIPRL, MPZL1, RCSD1, SFT2D2, CREG1 Computing nearest neighbor graph Computing SNN